-
Abstract: To address the difficulties of complex image backgrounds and large scale differences among targets in sea-surface ship multi-object tracking, an improved CSTrack-based ship multi-object tracking algorithm is proposed. First, because CSTrack decouples the neck features by brute force, which causes the loss of target features, an improved cross-correlation decoupling network combining the Res2net module, RES_CCN, is proposed, so that finer-grained features are obtained after decoupling. Second, to improve tracking performance on multiple ship classes, a decoupled detection-head design is adopted to predict target class, confidence, and position separately. Finally, ablation experiments on the MOT16 dataset verify the effectiveness of the proposed modules; when tested on the Singapore Maritime Dataset (SMD), the proposed algorithm improves multiple object tracking accuracy (MOTA) by 8.4 percentage points and the identification F1 score (IDF1) by 3.1 percentage points over the original CSTrack, outperforming ByteTrack and other algorithms. The proposed algorithm offers high tracking accuracy and a low false-detection rate and is well suited to sea-surface ship multi-object tracking.
-
Overview: Ship multi-object tracking is an important application scenario in the field of multi-object tracking (MOT) and can be widely applied in both military and civilian fields. The objective of MOT is to locate multiple ship targets, maintain a unique identity (ID) for each target, and record its continuous trajectory. The difficulty of MOT lies in false positives, false negatives, ID switches, and the uncertainty of the number of targets.

In the CSTrack multi-object tracking algorithm, the feature maps produced by the neck of the network are decoupled into two different feature vectors, which serve as the inputs of the object detection and re-identification (ReID) branches, respectively, to ease the competition between the two tasks and improve tracking performance. However, this brute-force decoupling causes the loss of target features, which degrades tracking performance under occlusion and for small or densely packed targets. To solve this issue, an improved cross-correlation network (CCN) named RES_CCN, which extracts fine-grained features, is proposed in this paper. The network is composed of an improved Res2net module, coordinate attention, and the CCN, and is inserted between the neck and head of the network, so that finer-grained features are obtained by enlarging the receptive field and adding hierarchical residual connections inside the residual unit before feature decoupling. To meet the requirements of multi-class ship tracking and improve detection performance, a decoupled detection-head design is adopted to predict target class, confidence, and position separately; binary cross-entropy is used as the classification loss and added to the total loss.

Finally, ablation results on the MOT16 dataset show that the proposed algorithm improves multiple object tracking accuracy (MOTA) by 4.6 percentage points and the identification F1 score (IDF1) by 3.4 percentage points over the original algorithm. When tested on the Singapore Maritime Dataset (SMD), the proposed algorithm improves MOTA by 8.4 percentage points and IDF1 by 3.1 percentage points over the original CSTrack, outperforming ByteTrack and other algorithms. Qualitative results show that the proposed algorithm can effectively detect small targets and maintain target IDs in sea-surface scenarios. The proposed algorithm achieves high tracking accuracy with a low false-detection rate and is suitable for ship multi-object tracking in sea-surface scenarios.
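To make the pre-decoupling design concrete, below is a minimal PyTorch sketch of a RES_CCN-style block, assuming a 256-channel neck feature, a 4-scale Res2net-style unit, and plain 1×1 projections standing in for the decoupling step; the module names and all hyperparameters here are illustrative, not the paper's exact implementation, and the real network additionally applies coordinate attention and cross-correlation weighting.

```python
# Illustrative sketch only; names and hyperparameters are assumptions,
# not CSTrack's or RES_CCN's exact code.
import torch
import torch.nn as nn

class Res2NetUnit(nn.Module):
    """Hierarchical residual unit: channels are split into `scales` groups,
    and each group's 3x3 conv also receives the previous group's output,
    enlarging the receptive field to yield finer-grained features."""
    def __init__(self, channels: int, scales: int = 4):
        super().__init__()
        assert channels % scales == 0
        self.scales = scales
        width = channels // scales
        # one 3x3 conv per group; the first group is passed through unchanged
        self.convs = nn.ModuleList(
            nn.Conv2d(width, width, 3, padding=1) for _ in range(scales - 1)
        )
        self.fuse = nn.Conv2d(channels, channels, 1)  # re-mix the groups

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        groups = torch.chunk(x, self.scales, dim=1)
        outs, prev = [groups[0]], groups[0]
        for conv, g in zip(self.convs, groups[1:]):
            prev = conv(g + prev)          # hierarchical residual connection
            outs.append(prev)
        return x + self.fuse(torch.cat(outs, dim=1))

class DecoupleBlock(nn.Module):
    """Refine the neck feature first, then split it into task-specific
    detection and ReID maps, instead of decoupling the raw neck output."""
    def __init__(self, channels: int = 256):
        super().__init__()
        self.refine = Res2NetUnit(channels)
        self.det_proj = nn.Conv2d(channels, channels, 1)   # detection branch
        self.reid_proj = nn.Conv2d(channels, channels, 1)  # ReID branch

    def forward(self, x: torch.Tensor):
        x = self.refine(x)
        return self.det_proj(x), self.reid_proj(x)

feat = torch.randn(1, 256, 76, 136)        # e.g. a 608x1088 frame at stride 8
det_feat, reid_feat = DecoupleBlock()(feat)
```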
-
Table 1. Video-sequence parameters of the adjusted SMD dataset
| SMD video sequence | Frames | Ferry | Vessel-ship | Speed-boat | Boat | Kayak | Sail-boat | Original split | Adjusted split |
|---|---|---|---|---|---|---|---|---|---|
| MVI_1448 | 600 | - | 3210 | 1410 | - | - | - | Test | - |
| MVI_1474 | 445 | 890 | 3560 | - | - | - | - | Test | - |
| MVI_1484 | 600 | 600 | 1200 | - | - | - | - | Test | - |
| MVI_1486 | 600 | 1023 | 4200 | - | - | - | - | Test | - |
| MVI_1582 | 540 | 540 | 5400 | - | - | - | - | Test | - |
| MVI_1612 | 261 | 165 | 2349 | - | - | - | - | Test | - |
| MVI_1626 | 556 | - | 2775 | - | - | - | - | Test | - |
| MVI_1627 | 600 | - | 4200 | - | - | - | - | Test | - |
| MVI_1640 | 310 | - | 1677 | 274 | - | - | - | Test | - |
| MVI_0797 | 600 | - | 767 | - | - | - | - | Test | - |
| MVI_1587 | 600 | - | 7800 | - | 600 | - | - | Train | Test |
| MVI_1592 | 491 | 491 | 2347 | - | - | 791 | - | Train | Test |
| MVI_1452 | 340 | - | 1360 | - | - | - | 340 | Train | Test |
| MVI_1469 | 600 | - | 3600 | 941 | - | - | - | Validation | Train |
| MVI_1578 | 505 | - | 3535 | - | - | - | - | Validation | Train |
| MVI_0790 | 600 | - | 70 | - | 140 | - | - | Validation | Train |
| MVI_0799 | 600 | - | 390 | 170 | - | - | - | - | Train |
Table 2. Influence of different modules on tracking performance on the MOT16 dataset
| Model | MOTA↑ | IDF1↑ | FP↓ | FN↓ | MT↑ | ML↓ | IDS↓ |
|---|---|---|---|---|---|---|---|
| Baseline | 79.4 | 77.9 | 6235 | 15584 | 354 | 29 | 876 |
| Baseline+Res2net* | 82.8 | 79.7 | 4714 | 13966 | 390 | 21 | 616 |
| Baseline+CA | 82.4 | 78.3 | 4776 | 14022 | 377 | 21 | 642 |
| Baseline+decoupled detection head | 82.7 | 79.2 | 4628 | 14318 | 375 | 28 | 571 |
| Baseline* | 82.2 | 75.4 | 4927 | 13801 | 389 | 23 | 875 |
| Baseline*+Res2net | 83.2 | 75.8 | 4459 | 13350 | 398 | 22 | 758 |
| Baseline*+Res2net* | 83.1 | 80.8 | 4413 | 13720 | 385 | 23 | 536 |
| Baseline*+Res2net*+CA (Baseline*+RES_CCN) | 83.4 | 81.9 | 4335 | 13434 | 393 | 18 | 571 |
| Baseline*+Res2net*+CA+decoupled detection head | 84.0 | 81.3 | 4000 | 13107 | 400 | 20 | 480 |
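The decoupled detection head ablated above predicts class, confidence, and box position with separate branches. The following is a hedged PyTorch sketch in the spirit of a YOLOX-style decoupled head; the 256-channel input and the six SMD ship classes are assumptions for illustration, and the paper's exact layer layout may differ. The class logits are intended for a per-class binary cross-entropy loss (e.g. nn.BCEWithLogitsLoss), as described in the overview.

```python
# Illustrative sketch; channel widths and class count are assumptions.
import torch
import torch.nn as nn

class DecoupledHead(nn.Module):
    def __init__(self, in_channels: int = 256, num_classes: int = 6):
        super().__init__()
        self.stem = nn.Conv2d(in_channels, in_channels, 1)
        # classification branch: per-class logits for a BCE loss
        self.cls_branch = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1), nn.SiLU(),
            nn.Conv2d(in_channels, num_classes, 1),
        )
        # regression trunk shared by the box and confidence predictions
        self.reg_branch = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1), nn.SiLU(),
        )
        self.box_pred = nn.Conv2d(in_channels, 4, 1)  # (x, y, w, h)
        self.obj_pred = nn.Conv2d(in_channels, 1, 1)  # confidence logit

    def forward(self, feat: torch.Tensor):
        x = self.stem(feat)
        reg = self.reg_branch(x)
        # raw logits; apply sigmoid / BCEWithLogitsLoss outside the head
        return self.cls_branch(x), self.obj_pred(reg), self.box_pred(reg)

cls_logits, obj_logit, box = DecoupledHead()(torch.randn(1, 256, 76, 136))
```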
Table 3. Influence of different attention mechanisms on tracking performance
| Model | MOTA↑ | IDF1↑ | FP↓ | FN↓ | MT↑ | ML↓ | IDS↓ |
|---|---|---|---|---|---|---|---|
| SE | 83.0 | 78.6 | 4624 | 13557 | 394 | 18 | 589 |
| CBAM | 83.6 | 80.8 | 4229 | 13402 | 391 | 20 | 491 |
| ECA | 80.5 | 79.3 | 3316 | 17806 | 351 | 29 | 489 |
| CA | 84.0 | 81.3 | 4000 | 13107 | 400 | 20 | 480 |
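Coordinate attention (CA), which performs best above, factorizes spatial pooling into height and width directions so that channel attention retains positional information. Below is a compact PyTorch sketch of the CA module following Hou et al. [29]; the reduction ratio of 32 matches the original paper, while other details are simplified.

```python
import torch
import torch.nn as nn

class CoordAtt(nn.Module):
    def __init__(self, channels: int, reduction: int = 32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # pool along width
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # pool along height
        self.conv1 = nn.Conv2d(channels, mid, 1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        x_h = self.pool_h(x)                       # (B, C, H, 1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)   # (B, C, W, 1)
        # jointly encode both directions, then split back
        y = self.act(self.bn(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                      # (B, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (B, C, 1, W)
        return x * a_h * a_w   # direction-aware attention weights

out = CoordAtt(256)(torch.randn(2, 256, 76, 136))
```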
Table 4. Influence of ReID weight parameters on tracking performance
| ReID weight | MOTA↑ | IDF1↑ | FP↓ | FN↓ | MT↑ | ML↓ | IDS↓ |
|---|---|---|---|---|---|---|---|
| 4×10⁻² | 80.1 | 83.0 | 4685 | 14488 | 374 | 28 | 576 |
| 4×10⁻³ | 81.1 | 82.6 | 4416 | 13687 | 388 | 23 | 530 |
| 4×10⁻⁴ | 84.0 | 81.3 | 4000 | 13107 | 400 | 20 | 480 |
| 4×10⁻⁵ | 83.5 | 80.2 | 4319 | 13379 | 396 | 22 | 530 |
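The ReID weight above scales the ReID loss against the detection terms in the total objective. A minimal sketch of such a weighted multi-task loss is given below, assuming a simple linear combination with the best weight from the table (4×10⁻⁴) as the default and the per-class BCE term from the decoupled head included; CSTrack's actual weighting scheme (e.g. learnable task-uncertainty weights) may differ.

```python
# Sketch under the assumption of a plain linear combination of loss terms.
import torch
import torch.nn.functional as F

def total_loss(l_box: torch.Tensor, l_obj: torch.Tensor,
               l_cls: torch.Tensor, l_reid: torch.Tensor,
               reid_weight: float = 4e-4) -> torch.Tensor:
    # detection terms: box regression + confidence + per-class BCE,
    # plus the down-weighted ReID embedding loss
    return l_box + l_obj + l_cls + reid_weight * l_reid

# illustrative per-class BCE term for 8 candidates over 6 ship classes
cls_logits = torch.randn(8, 6)
cls_targets = torch.zeros(8, 6).scatter_(1, torch.randint(0, 6, (8, 1)), 1.0)
l_cls = F.binary_cross_entropy_with_logits(cls_logits, cls_targets)
```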
Table 5. Comparison of tracking performance between the proposed method and other state-of-the-art methods on the SMD dataset
| Algorithm | MOTA↑ | IDF1↑ | FP↓ | FN↓ | MT↑ | ML↓ | IDS↓ |
|---|---|---|---|---|---|---|---|
| DeepSORT | 31.1 | 62.3 | 21678 | 11082 | 69 | 25 | 224 |
| StrongSORT | 42.1 | 65.0 | 13264 | 17233 | 63 | 21 | 224 |
| ByteTrack | 44.8 | 67.3 | 9387 | 17003 | 57 | 26 | 49 |
| CSTrack | 38.5 | 62.6 | 9760 | 19617 | 48 | 33 | 109 |
| Ours | 46.9 | 65.7 | 6658 | 16565 | 43 | 23 | 172 |
-
[1] Ciaparrone G, Sánchez F L, Tabik S, et al. Deep learning in video multi-object tracking: a survey[J]. Neurocomputing, 2020, 381: 61−88. doi: 10.1016/j.neucom.2019.11.023
[2] Wu H, Nie J H, Zhang Z W, et al. Deep learning-based visual multiple object tracking: a review[J]. Comput Sci, 2023, 50(4): 77−87. doi: 10.11896/jsjkx.220300173
[3] Wang G A, Song M L, Hwang J N. Recent advances in embedding methods for multi-object tracking: a survey[Z]. arXiv: 2205.10766, 2022. https://doi.org/10.48550/arXiv.2205.10766.
[4] Xiao T, Li S, Wang B C, et al. Joint detection and identification feature learning for person search[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, 2017: 3376–3385. https://doi.org/10.1109/CVPR.2017.360.
[5] Wojke N, Bewley A, Paulus D. Simple online and realtime tracking with a deep association metric[C]//Proceedings of 2017 IEEE International Conference on Image Processing, Beijing, 2017: 3645–3649. https://doi.org/10.1109/ICIP.2017.8296962.
[6] Du Y H, Zhao Z C, Song Y, et al. StrongSORT: make DeepSORT great again[Z]. arXiv: 2202.13514, 2023. https://doi.org/10.48550/arXiv.2202.13514.
[7] Zhang Y F, Sun P Z, Jiang Y, et al. ByteTrack: multi-object tracking by associating every detection box[C]//Proceedings of the 17th European Conference on Computer Vision, Tel Aviv, 2022: 1–21. https://doi.org/10.1007/978-3-031-20047-2_1.
[8] Zhang Y F, Wang C Y, Wang X G, et al. FairMOT: on the fairness of detection and re-identification in multiple object tracking[J]. Int J Comput Vis, 2021, 129(11): 3069−3087. doi: 10.1007/s11263-021-01513-4
[9] Liang C, Zhang Z P, Zhou X, et al. Rethinking the competition between detection and ReID in multiobject tracking[J]. IEEE Trans Image Process, 2022, 31: 3182−3196. doi: 10.1109/TIP.2022.3165376
[10] Prasad D K, Rajan D, Rachmawati L, et al. Video processing from electro-optical sensors for object detection and tracking in a maritime environment: a survey[J]. IEEE Trans Intell Transp Syst, 2017, 18(8): 1993−2016. doi: 10.1109/TITS.2016.2634580
[11] Milan A, Leal-Taixé L, Reid I, et al. MOT16: a benchmark for multi-object tracking[Z]. arXiv: 1603.00831, 2016. https://doi.org/10.48550/arXiv.1603.00831.
[12] Wu J L, Cao J L, Song L C, et al. Track to detect and segment: an online multi-object tracker[C]//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, 2021: 12347–12356. https://doi.org/10.1109/CVPR46437.2021.01217.
[13] Ren S Q, He K M, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Trans Pattern Anal Mach Intell, 2017, 39(6): 1137−1149. doi: 10.1109/TPAMI.2016.2577031
[14] Wang Z D, Zheng L, Liu Y X, et al. Towards real-time multi-object tracking[C]//Proceedings of the 16th European Conference on Computer Vision, Glasgow, 2020: 107–122. https://doi.org/10.1007/978-3-030-58621-8_7.
[15] Yu E, Li Z L, Han S D, et al. RelationTrack: relation-aware multiple object tracking with decoupled representation[J]. IEEE Trans Multimedia, 2022, 25: 2686−2697. doi: 10.1109/TMM.2022.3150169
[16] Wan X Y, Zhou S P, Wang J J, et al. Multiple object tracking by trajectory map regression with temporal priors embedding[C]//Proceedings of the 29th ACM International Conference on Multimedia, 2021: 1377–1386. https://doi.org/10.1145/3474085.3475304.
[17] Meng F J, Wang X Q, Wang D, et al. Spatial–semantic and temporal attention mechanism-based online multi-object tracking[J]. Sensors, 2020, 20(6): 1653. doi: 10.3390/s20061653
[18] Guo S, Wang J Y, Wang X C, et al. Online multiple object tracking with cross-task synergy[C]//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, 2021: 8132–8141. https://doi.org/10.1109/CVPR46437.2021.00804.
[19] Bloisi D D, Iocchi L, Pennisi A, et al. ARGOS-Venice boat classification[C]//Proceedings of the 2015 12th IEEE International Conference on Advanced Video and Signal Based Surveillance, Karlsruhe, 2015: 1–6. https://doi.org/10.1109/AVSS.2015.7301727.
[20] Shao Z F, Wu W J, Wang Z Y, et al. SeaShips: a large-scale precisely annotated dataset for ship detection[J]. IEEE Trans Multimedia, 2018, 20(10): 2593−2604. doi: 10.1109/TMM.2018.2865686
[21] Ribeiro R, Cruz G, Matos J, et al. A data set for airborne maritime surveillance environments[J]. IEEE Trans Circuits Syst Video Technol, 2019, 29(9): 2720−2732. doi: 10.1109/TCSVT.2017.2775524
[22] Xu A L, Du D, Wang H H, et al. Optical ship target detection method combining hierarchical search and visual residual network[J]. Opto-Electron Eng, 2021, 48(4): 200249. doi: 10.12086/oee.2021.200249
[23] Yu G L, Sang J G, Li J R. Ship real-time target tracking and recognition technology based on improved convolutional neural network[J]. Ship Sci Technol, 2022, 44(21): 152−155. doi: 10.3404/j.issn.1672-7649.2022.21.031
[24] Li G Y, Qiao Y L. A ship target detection and tracking algorithm based on graph matching[J]. J Phys Conf Ser, 2021, 1873: 012056. doi: 10.1088/1742-6596/1873/1/012056
[25] Zhou Y D. Research on ship multiple object tracking in remote sensing image based on deep learning[D]. Xi’an: Xidian University, 2021. https://doi.org/10.27389/d.cnki.gxadu.2021.000391.
[26] Chen Q L. Research on automatic annotation and multi-target tracking algorithm for ship video target detection[D]. Hangzhou: Hangzhou Dianzi University, 2021. https://doi.org/10.27075/d.cnki.ghzdc.2021.000349.
[27] Chen X, Peng D L, Gu Y. Real-time object detection for UAV images based on improved YOLOv5s[J]. Opto-Electron Eng, 2022, 49(3): 210372. doi: 10.12086/oee.2022.210372
[28] Gao S H, Cheng M M, Zhao K, et al. Res2Net: a new multi-scale backbone architecture[J]. IEEE Trans Pattern Anal Mach Intell, 2021, 43(2): 652−662. doi: 10.1109/TPAMI.2019.2938758
[29] Hou Q B, Zhou D Q, Feng J S. Coordinate attention for efficient mobile network design[C]//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, 2021: 13708–13717. https://doi.org/10.1109/CVPR46437.2021.01350.
[30] Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, 2018: 7132–7141. https://doi.org/10.1109/CVPR.2018.00745.
[31] Woo S, Park J, Lee J Y, et al. CBAM: convolutional block attention module[C]//Proceedings of the 15th European Conference on Computer Vision, Munich, 2018: 3–19. https://doi.org/10.1007/978-3-030-01234-2_1.
[32] Wang Q L, Wu B G, Zhu P F, et al. ECA-Net: efficient channel attention for deep convolutional neural networks[C]//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, 2020: 11531–11539. https://doi.org/10.1109/CVPR42600.2020.01155.
[33] Li C Y, Li L L, Jiang H L, et al. YOLOv6: a single-stage object detection framework for industrial applications[Z]. arXiv: 2209.02976, 2022. https://doi.org/10.48550/arXiv.2209.02976.
[34] Ge Z, Liu S T, Wang F, et al. YOLOX: exceeding YOLO series in 2021[Z]. arXiv: 2107.08430, 2021. https://doi.org/10.48550/arXiv.2107.08430.
[35] Moosbauer S, König D, Jäkel J, et al. A benchmark for deep learning based object detection in maritime environments[C]//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, 2019: 916–925. https://doi.org/10.1109/CVPRW.2019.00121.
-