Improved CSTrack algorithm for multi-class ship multi-object tracking

Citation: Yuan Z A, Gu Y, Ma G. Improved CSTrack algorithm for multi-class ship multi-object tracking[J]. Opto-Electron Eng, 2023, 50(12): 230218. doi: 10.12086/oee.2023.230218


  • Fund project: Supported by the Natural Science Foundation of Zhejiang Province (LY21F030010, LZ23F030002)
  • *Corresponding author: Gu Yu, guyu@hdu.edu.cn
  • CLC number: TP391.41

  • Abstract: To address the difficulties of maritime ship multi-object tracking, such as complex image backgrounds and large differences in object scale, an improved CSTrack ship multi-object tracking algorithm is proposed. First, to address the loss of object features caused by the brute-force decoupling of neck features in CSTrack, an improved cross-correlation decoupling network, RES_CCN, which incorporates the Res2net module, is proposed so that the network obtains finer-grained features after decoupling. Second, to improve tracking performance for multiple ship classes, a decoupled detection head is adopted to predict object class, confidence, and position separately. Finally, ablation experiments on the MOT16 dataset verify the effectiveness of the proposed modules. When tested on the Singapore Maritime Dataset, the proposed algorithm improves multiple object tracking accuracy by 8.4 and identification F1 score by 3.1, outperforming ByteTrack and other algorithms. The proposed algorithm offers high tracking accuracy and a low false-detection rate, and is well suited to maritime ship multi-object tracking.

  • Overview: Ship multi-object tracking is an important application of multi-object tracking (MOT), with wide uses in both military and civilian fields. The objective of MOT is to locate multiple ship objects, maintain a unique identity (ID) for each ship, and record its continuous trajectory. The difficulty of MOT lies in false positives, false negatives, ID switches, and the uncertain number of objects. In the CSTrack algorithm, the feature maps produced by the neck of the network are decoupled into two different feature vectors that serve as the inputs of the object detection and re-identification (ReID) branches respectively, which alleviates the contradiction between the two tasks and improves tracking performance. However, this brute-force decoupling causes a loss of object features, which degrades tracking performance under occlusion, for small objects, or in dense scenes. To solve this issue, an improved cross-correlation network (CCN) named RES_CCN, which extracts fine-grained features, is proposed in this paper. It is composed of an improved Res2net network, coordinate attention, and the CCN network, and is inserted between the neck and head of the network, so that more fine-grained features are obtained before decoupling by enlarging the receptive field and inserting more hierarchical residual connections into the residual unit. To meet the requirements of multi-class ship multi-object tracking and improve detection performance, a decoupled detection head is used to predict the class, confidence, and position of objects separately, and binary cross-entropy is used as the classification loss and added to the total loss function.
Finally, ablation results on the MOT16 dataset show that the multiple object tracking accuracy (MOTA) of the proposed algorithm improves by 4.6 over the original algorithm, and the identification F1 score (IDF1) by 3.4. When tested on the Singapore Maritime Dataset, the MOTA of the proposed algorithm improves by 8.4 over the original CSTrack and IDF1 by 3.1, outperforming ByteTrack and other algorithms. Qualitative results show that the proposed algorithm effectively detects small objects and maintains object IDs in sea-surface scenarios. The proposed algorithm offers high tracking accuracy and a low false-detection rate, and is suitable for ship multi-object tracking in sea-surface scenarios.
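The MOTA and IDF1 scores quoted above follow the standard CLEAR MOT and ID-metric definitions; a minimal sketch of both (the counts below are hypothetical, not the paper's data):

```python
def mota(fp, fn, ids, num_gt):
    """Multiple Object Tracking Accuracy: 1 - (FP + FN + IDS) / GT,
    where GT is the total number of ground-truth boxes."""
    return 1.0 - (fp + fn + ids) / num_gt

def idf1(idtp, idfp, idfn):
    """Identification F1: harmonic mean of ID precision and ID recall."""
    return 2.0 * idtp / (2.0 * idtp + idfp + idfn)

# Hypothetical counts for illustration only
print(mota(fp=10, fn=20, ids=5, num_gt=100))  # 0.65
print(idf1(idtp=80, idfp=10, idfn=10))        # 0.888...
```

Note that MOTA can go negative when errors exceed the number of ground-truth boxes, which is why the tables below report it on a 0-100 scale rather than as a bounded percentage.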


    Figure 1.  Flowchart of the JDE and CSTrack algorithms. (a) JDE; (b) CSTrack

    Figure 2.  Network architecture of CSTrack. (a) Overall framework; (b) CCN and RES_CCN networks; (c) SAAN network; (d) SAM network; (e) CAM network


    Figure 3.  Overall framework and feature extraction network architecture of the proposed method


    Figure 4.  Network architecture of the improved Res2net
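The hierarchical residual pattern of Res2net in Figure 4 can be sketched as follows; the 3×3 convolutions of the real module are replaced by identity maps here, so only the channel-split and cascaded-add structure (which enlarges the receptive field scale by scale) is shown:

```python
import numpy as np

def res2net_mix(x, scales=4):
    """Res2Net-style hierarchical mixing of a (C, H, W) feature map.
    Channels are split into `scales` groups; each group receives the
    output of the previous group before its (omitted) 3x3 conv."""
    groups = np.split(x, scales, axis=0)
    outs, prev = [groups[0]], None
    for g in groups[1:]:
        h = g if prev is None else g + prev  # hierarchical residual add
        # real module: h = conv3x3(h); identity is used in this sketch
        outs.append(h)
        prev = h
    return np.concatenate(outs, axis=0)

x = np.array([1.0, 2.0, 3.0, 4.0]).reshape(4, 1, 1)  # one channel per group
print(res2net_mix(x).flatten())  # [1. 2. 5. 9.]
```

Because each later group sees the accumulated output of the earlier groups, a single unit exposes several effective receptive-field sizes, which is the property the RES_CCN module exploits before feature decoupling.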


    Figure 5.  Network architecture of CA
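The direction-aware pooling at the heart of coordinate attention (CA, Figure 5) can be sketched as below; the shared 1×1 convolution and batch normalization of the real module are omitted, leaving only the per-row/per-column pooling and sigmoid gating:

```python
import numpy as np

def coord_attention(x):
    """Simplified coordinate attention over a (C, H, W) feature map.
    Pools along width and height separately, so positional information
    along each axis is preserved, then gates the input with both maps."""
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    a_h = sigmoid(x.mean(axis=2))  # (C, H): pooled along width
    a_w = sigmoid(x.mean(axis=1))  # (C, W): pooled along height
    # the real module applies a shared 1x1 conv + BN between pool and gate
    return x * a_h[:, :, None] * a_w[:, None, :]

x = np.ones((1, 2, 2))
print(coord_attention(x)[0, 0, 0])  # sigmoid(1)**2, about 0.534
```

Unlike channel attention such as SE (compared in Table 3), the two 1-D pooled maps keep coordinate information, which helps localize elongated ship targets.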


    Figure 6.  Network architecture of decoupled head
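The decoupled head of Figure 6 predicts class, confidence, and box position from separate branches, with binary cross-entropy as the per-class loss. A minimal sketch (the branch weights, feature size, and class count here are illustrative, not the paper's configuration):

```python
import numpy as np

def bce(p, y, eps=1e-7):
    """Binary cross-entropy, applied per ship class in the decoupled head."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)).mean())

def decoupled_head(feat, w_cls, w_conf, w_box):
    """Three independent linear branches over a shared feature vector:
    class scores, objectness confidence, and box offsets."""
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    return sigmoid(w_cls @ feat), sigmoid(w_conf @ feat), w_box @ feat

rng = np.random.default_rng(0)
feat = rng.normal(size=8)
cls, conf, box = decoupled_head(feat,
                                rng.normal(size=(6, 8)),  # 6 SMD ship classes
                                rng.normal(size=(1, 8)),
                                rng.normal(size=(4, 8)))
print(cls.shape, conf.shape, box.shape)  # (6,) (1,) (4,)
print(round(bce(np.array([0.5]), np.array([1.0])), 4))  # 0.6931, i.e. ln 2
```

Keeping the branches separate avoids forcing one set of weights to serve both classification and regression, the same contradiction the ReID/detection decoupling addresses at the neck.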


    Figure 7.  Flowchart of matching cascade
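The cascade in Figure 7 gives tracks that were updated most recently the first chance to claim detections. Below is a greedy IoU stand-in for the Hungarian assignment used in practice (the helper names and thresholds are illustrative):

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    if inter == 0.0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def cascade_match(tracks, dets, iou_thresh=0.5, max_age=3):
    """Match tracks to detections age by age: recently updated tracks
    (age 0) pick first, older tracks match the leftover detections."""
    matches, unused = [], set(range(len(dets)))
    for age in range(max_age + 1):
        for t in (t for t in tracks if t["age"] == age):
            best, best_iou = None, iou_thresh
            for j in unused:
                v = iou(t["box"], dets[j])
                if v >= best_iou:
                    best, best_iou = j, v
            if best is not None:
                matches.append((t["id"], best))
                unused.discard(best)
    return matches

tracks = [{"id": 1, "box": (0, 0, 10, 10), "age": 0},
          {"id": 2, "box": (20, 0, 30, 10), "age": 1}]
dets = [(21, 0, 31, 10), (1, 0, 11, 10)]
print(cascade_match(tracks, dets))  # [(1, 1), (2, 0)]
```

Prioritizing fresh tracks reduces ID switches when an occluded ship reappears near a long-lived but stale trajectory.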


    Figure 8.  Comparison of visualization results between our method and baseline on SMD validation set. (a) FN and FP; (b) ID switch and FN


    Figure 9.  Comparison of visualization results between our method and baseline on MOT validation set. (a) FP and FN; (b) ID switch and special FP

    Table 1.  Parameters of the video sequences in the adjusted SMD dataset (per-class object instance counts; "Before"/"After" give each sequence's subset assignment before and after the adjustment)

    Sequence | Frames | Ferry | Vessel-ship | Speed-boat | Boat | Kayak | Sail-boat | Before | After
    MVI_1448 | 600 | - | 3210 | 1410 | - | - | - | Test | -
    MVI_1474 | 445 | 890 | 3560 | - | - | - | - | Test | -
    MVI_1484 | 600 | 600 | 1200 | - | - | - | - | Test | -
    MVI_1486 | 600 | 1023 | 4200 | - | - | - | - | Test | -
    MVI_1582 | 540 | 540 | 5400 | - | - | - | - | Test | -
    MVI_1612 | 261 | 165 | 2349 | - | - | - | - | Test | -
    MVI_1626 | 556 | - | 2775 | - | - | - | - | Test | -
    MVI_1627 | 600 | - | 4200 | - | - | - | - | Test | -
    MVI_1640 | 310 | - | 1677 | 274 | - | - | - | Test | -
    MVI_0797 | 600 | - | 767 | - | - | - | - | Test | -
    MVI_1587 | 600 | - | 7800 | - | 600 | - | - | Train | Test
    MVI_1592 | 491 | 491 | 2347 | - | - | 791 | - | Train | Test
    MVI_1452 | 340 | - | 1360 | - | - | - | 340 | Train | Test
    MVI_1469 | 600 | - | 3600 | 941 | - | - | - | Val | Train
    MVI_1578 | 505 | - | 3535 | - | - | - | - | Val | Train
    MVI_0790 | 600 | - | 70 | - | 140 | - | - | Val | Train
    MVI_0799 | 600 | - | 390 | 170 | - | - | - | - | Train

    Table 2.  Influence of different modules on the tracking performance on the MOT16 dataset

    Model | MOTA↑ | IDF1↑ | FP↓ | FN↓ | MT↑ | ML↓ | IDS↓
    Baseline | 79.4 | 77.9 | 6235 | 15584 | 354 | 29 | 876
    Baseline+Res2net* | 82.8 | 79.7 | 4714 | 13966 | 390 | 21 | 616
    Baseline+CA | 82.4 | 78.3 | 4776 | 14022 | 377 | 21 | 642
    Baseline+decoupled head | 82.7 | 79.2 | 4628 | 14318 | 375 | 28 | 571
    Baseline* | 82.2 | 75.4 | 4927 | 13801 | 389 | 23 | 875
    Baseline*+Res2net | 83.2 | 75.8 | 4459 | 13350 | 398 | 22 | 758
    Baseline*+Res2net* | 83.1 | 80.8 | 4413 | 13720 | 385 | 23 | 536
    Baseline*+Res2net*+CA (Baseline*+RES_CCN) | 83.4 | 81.9 | 4335 | 13434 | 393 | 18 | 571
    Baseline*+Res2net*+CA+decoupled head | 84.0 | 81.3 | 4000 | 13107 | 400 | 20 | 480

    Table 3.  Influence of different attention mechanisms on tracking performance

    Model | MOTA↑ | IDF1↑ | FP↓ | FN↓ | MT↑ | ML↓ | IDS↓
    SE | 83.0 | 78.6 | 4624 | 13557 | 394 | 18 | 589
    CBAM | 83.6 | 80.8 | 4229 | 13402 | 391 | 20 | 491
    ECA | 80.5 | 79.3 | 3316 | 17806 | 351 | 29 | 489
    CA | 84.0 | 81.3 | 4000 | 13107 | 400 | 20 | 480

    Table 4.  Influence of the ReID weight parameter on tracking performance

    ReID weight | MOTA↑ | IDF1↑ | FP↓ | FN↓ | MT↑ | ML↓ | IDS↓
    4×10⁻² | 80.1 | 83.0 | 4685 | 14488 | 374 | 28 | 576
    4×10⁻³ | 81.1 | 82.6 | 4416 | 13687 | 388 | 23 | 530
    4×10⁻⁴ | 84.0 | 81.3 | 4000 | 13107 | 400 | 20 | 480
    4×10⁻⁵ | 83.5 | 80.2 | 4319 | 13379 | 396 | 22 | 530

    Table 5.  Comparison of tracking performance between the proposed method and other state-of-the-art methods on the SMD dataset

    Algorithm | MOTA↑ | IDF1↑ | FP↓ | FN↓ | MT↑ | ML↓ | IDS↓
    DeepSORT | 31.1 | 62.3 | 21678 | 11082 | 69 | 25 | 224
    StrongSORT | 42.1 | 65.0 | 13264 | 17233 | 63 | 21 | 224
    ByteTrack | 44.8 | 67.3 | 9387 | 17003 | 57 | 26 | 49
    CSTrack | 38.5 | 62.6 | 9760 | 19617 | 48 | 33 | 109
    Ours (proposed) | 46.9 | 65.7 | 6658 | 16565 | 43 | 23 | 172
    [1] Ciaparrone G, Sánchez F L, Tabik S, et al. Deep learning in video multi-object tracking: a survey[J]. Neurocomputing, 2020, 381: 61−88. doi: 10.1016/j.neucom.2019.11.023
    [2] Wu H, Nie J H, Zhang Z W, et al. Deep learning-based visual multiple object tracking: a review[J]. Comput Sci, 2023, 50(4): 77−87. doi: 10.11896/jsjkx.220300173
    [3] Wang G A, Song M L, Hwang J N. Recent advances in embedding methods for multi-object tracking: a survey[Z]. arXiv: 2205.10766, 2022.
    [4] Xiao T, Li S, Wang B C, et al. Joint detection and identification feature learning for person search[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, 2017: 3376–3385. doi: 10.1109/CVPR.2017.360
    [5] Wojke N, Bewley A, Paulus D. Simple online and realtime tracking with a deep association metric[C]//Proceedings of 2017 IEEE International Conference on Image Processing, Beijing, 2017: 3645–3649. doi: 10.1109/ICIP.2017.8296962
    [6] Du Y H, Zhao Z C, Song Y, et al. StrongSORT: make DeepSORT great again[Z]. arXiv: 2202.13514, 2023.
    [7] Zhang Y F, Sun P Z, Jiang Y, et al. ByteTrack: multi-object tracking by associating every detection box[C]//Proceedings of the 17th European Conference on Computer Vision, Tel Aviv, 2022: 1–21. doi: 10.1007/978-3-031-20047-2_1
    [8] Zhang Y F, Wang C Y, Wang X G, et al. FairMOT: on the fairness of detection and re-identification in multiple object tracking[J]. Int J Comput Vis, 2021, 129(11): 3069−3087. doi: 10.1007/s11263-021-01513-4
    [9] Liang C, Zhang Z P, Zhou X, et al. Rethinking the competition between detection and ReID in multiobject tracking[J]. IEEE Trans Image Process, 2022, 31: 3182−3196. doi: 10.1109/TIP.2022.3165376
    [10] Prasad D K, Rajan D, Rachmawati L, et al. Video processing from electro-optical sensors for object detection and tracking in a maritime environment: a survey[J]. IEEE Trans Intell Transp Syst, 2017, 18(8): 1993−2016. doi: 10.1109/TITS.2016.2634580
    [11] Milan A, Leal-Taixé L, Reid I, et al. MOT16: a benchmark for multi-object tracking[Z]. arXiv: 1603.00831, 2016.
    [12] Wu J L, Cao J L, Song L C, et al. Track to detect and segment: an online multi-object tracker[C]//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, 2021: 12347–12356. doi: 10.1109/CVPR46437.2021.01217
    [13] Ren S Q, He K M, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Trans Pattern Anal Mach Intell, 2017, 39(6): 1137−1149. doi: 10.1109/TPAMI.2016.2577031
    [14] Wang Z D, Zheng L, Liu Y X, et al. Towards real-time multi-object tracking[C]//Proceedings of the 16th European Conference on Computer Vision, Glasgow, 2020: 107–122. doi: 10.1007/978-3-030-58621-8_7
    [15] Yu E, Li Z L, Han S D, et al. RelationTrack: relation-aware multiple object tracking with decoupled representation[J]. IEEE Trans Multimedia, 2022, 25: 2686−2697. doi: 10.1109/TMM.2022.3150169
    [16] Wan X Y, Zhou S P, Wang J J, et al. Multiple object tracking by trajectory map regression with temporal priors embedding[C]//Proceedings of the 29th ACM International Conference on Multimedia, 2021: 1377–1386. doi: 10.1145/3474085.3475304
    [17] Meng F J, Wang X Q, Wang D, et al. Spatial–semantic and temporal attention mechanism-based online multi-object tracking[J]. Sensors, 2020, 20(6): 1653. doi: 10.3390/s20061653
    [18] Guo S, Wang J Y, Wang X C, et al. Online multiple object tracking with cross-task synergy[C]//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, 2021: 8132–8141. doi: 10.1109/CVPR46437.2021.00804
    [19] Bloisi D D, Iocchi L, Pennisi A, et al. ARGOS-Venice boat classification[C]//Proceedings of the 12th IEEE International Conference on Advanced Video and Signal Based Surveillance, Karlsruhe, 2015: 1–6. doi: 10.1109/AVSS.2015.7301727
    [20] Shao Z F, Wu W J, Wang Z Y, et al. SeaShips: a large-scale precisely annotated dataset for ship detection[J]. IEEE Trans Multimedia, 2018, 20(10): 2593−2604. doi: 10.1109/TMM.2018.2865686
    [21] Ribeiro R, Cruz G, Matos J, et al. A data set for airborne maritime surveillance environments[J]. IEEE Trans Circuits Syst Video Technol, 2017, 29(9): 2720−2732. doi: 10.1109/TCSVT.2017.2775524
    [22] Xu A L, Du D, Wang H H, et al. Optical ship target detection method combining hierarchical search and visual residual network[J]. Opto-Electron Eng, 2021, 48(4): 200249. doi: 10.12086/oee.2021.200249
    [23] Yu G L, Sang J G, Li J R. Ship real-time target tracking and recognition technology based on improved convolutional neural network[J]. Ship Sci Technol, 2022, 44(21): 152−155. doi: 10.3404/j.issn.1672-7649.2022.21.031
    [24] Li G Y, Qiao Y L. A ship target detection and tracking algorithm based on graph matching[J]. J Phys Conf Ser, 2021, 1873: 012056. doi: 10.1088/1742-6596/1873/1/012056
    [25] Zhou Y D. Research on ship multiple object tracking in remote sensing image based on deep learning[D]. Xi'an: Xidian University, 2021. doi: 10.27389/d.cnki.gxadu.2021.000391
    [26] Chen Q L. Research on automatic annotation and multi-target tracking algorithm for ship video target detection[D]. Hangzhou: Hangzhou Dianzi University, 2021. doi: 10.27075/d.cnki.ghzdc.2021.000349
    [27] Chen X, Peng D L, Gu Y. Real-time object detection for UAV images based on improved YOLOv5s[J]. Opto-Electron Eng, 2022, 49(3): 210372. doi: 10.12086/oee.2022.210372
    [28] Gao S H, Cheng M M, Zhao K, et al. Res2Net: a new multi-scale backbone architecture[J]. IEEE Trans Pattern Anal Mach Intell, 2019, 43(2): 652−662. doi: 10.1109/TPAMI.2019.2938758
    [29] Hou Q B, Zhou D Q, Feng J S. Coordinate attention for efficient mobile network design[C]//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, 2021: 13708–13717. doi: 10.1109/CVPR46437.2021.01350
    [30] Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, 2018: 7132–7141. doi: 10.1109/CVPR.2018.00745
    [31] Woo S, Park J, Lee J Y, et al. CBAM: convolutional block attention module[C]//Proceedings of the 15th European Conference on Computer Vision, Munich, 2018: 3–19. doi: 10.1007/978-3-030-01234-2_1
    [32] Wang Q L, Wu B G, Zhu P F, et al. ECA-Net: efficient channel attention for deep convolutional neural networks[C]//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, 2020: 11531–11539. doi: 10.1109/CVPR42600.2020.01155
    [33] Li C Y, Li L L, Jiang H L, et al. YOLOv6: a single-stage object detection framework for industrial applications[Z]. arXiv: 2209.02976, 2022.
    [34] Ge Z, Liu S T, Wang F, et al. YOLOX: exceeding YOLO series in 2021[Z]. arXiv: 2107.08430, 2021.
    [35] Moosbauer S, König D, Jäkel J, et al. A benchmark for deep learning based object detection in maritime environments[C]//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, 2019: 916–925. doi: 10.1109/CVPRW.2019.00121


Publication history
Received: 2023-09-01
Revised: 2023-12-04
Accepted: 2023-12-05
Published: 2024-01-19
