Abstract
Visual tracking algorithms based on Siamese networks have become an important class of methods in recent years, with good performance in both tracking speed and accuracy. However, most Siamese-network trackers rely on an offline-trained model and lack online updating of the tracker. To address this problem, this paper proposes a Siamese network visual tracking algorithm based on online learning. The algorithm adopts a dual-template scheme: the target in the first frame is treated as a static template, and a high-confidence update strategy is used to obtain a dynamic template from subsequent frames. During online tracking, a fast transformation learning model learns the target's appearance changes from the dual templates; at the same time, a target likelihood probability map of the search region is computed from the color histogram features of the current frame and fused with the deep features for background-suppression learning. Finally, the response maps obtained from the two templates are fused with weights to produce the final tracking result. Experimental results on the OTB2015, TempleColor128, and VOT datasets show that the proposed algorithm improves on several recent mainstream trackers and achieves good tracking performance in challenging scenarios such as target deformation, similar-background interference, and fast motion.
Key words: target tracking; Siamese network; dual templates; fast transformation learning model
Overview
Visual tracking is a fundamental and challenging task in computer vision: given the target's state in the initial frame, the tracker must predict the target's position in all subsequent frames. It is widely used in intelligent surveillance, autonomous driving, military reconnaissance, and other fields. During tracking, the target commonly undergoes scale change, motion blur, deformation, and occlusion. Most current trackers based on discriminative models fall into two groups: correlation filter trackers built on hand-crafted or CNN features, and Siamese network trackers. Siamese-network-based tracking has become an important line of work in recent years because it offers good performance in both speed and accuracy. However, most Siamese trackers rely on an offline-trained model and do not update the tracker online. Guo et al. proposed the DSiam algorithm, which constructs a dynamic Siamese network containing a fast transformation learning model, so that the target's appearance changes can be learned and the background suppressed online during tracking. DSiam nevertheless has two shortcomings. First, in the tracking stage it does not exploit the rich information contained in historical frames. Second, for background suppression it applies only a Gaussian weight map to the search region, which cannot effectively highlight the target or suppress the background. To address these problems, we propose a Siamese network visual tracking algorithm based on online learning. The main contributions are as follows:
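As background for the contributions below, the fast transformation learning used by DSiam-style trackers is a regularized linear regression with a closed-form, element-wise solution in the Fourier domain. The following is a minimal NumPy sketch under that reading; the function names and the default regularization weight lam are illustrative, not the paper's implementation.

```python
import numpy as np

def learn_transform(x, y, lam=1.0):
    """Solve min_W ||W * x - y||^2 + lam * ||W||^2 (with '*' circular
    convolution), element-wise per channel in the frequency domain.
    x, y: (H, W, C) real-valued feature maps; the transform maps x to y."""
    X = np.fft.fft2(x, axes=(0, 1))
    Y = np.fft.fft2(y, axes=(0, 1))
    return (Y * np.conj(X)) / (X * np.conj(X) + lam)

def apply_transform(W, x):
    """Apply a learned transform to new features and return to the spatial
    domain; any imaginary residue is numerical noise."""
    X = np.fft.fft2(x, axes=(0, 1))
    return np.real(np.fft.ifft2(W * X, axes=(0, 1)))
```

In DSiam, one such transform (learned from the initial template toward the most recent target appearance) updates the template branch, and a second one performs background suppression on the search branch.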
The algorithm adopts a dual-template scheme: the target in the first frame serves as a static template, while a high-confidence update strategy selects a dynamic template from subsequent frames (a sketch of one such update rule follows this list).
During online tracking, the fast transformation learning model learns the target's appearance changes from the dual templates; at the same time, a target likelihood probability map of the search region is computed from the color histogram features of the current frame and fused with the deep features to carry out background-suppression learning (see the color-model sketch after this list).
Finally, the response maps produced by the two templates are fused with weights to obtain the final tracking result.
Experimental results on the OTB2015, TempleColor128, and VOT datasets show that the proposed algorithm improves on several recent mainstream trackers and achieves good tracking performance in challenging scenarios such as target deformation, similar-background interference, and fast motion.
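The high-confidence update strategy for the dynamic template can be illustrated with a short sketch. A common criterion in the cited literature is the average peak-to-correlation energy (APCE) of LMCF [16]: the template is updated only when both the response peak and the APCE are high relative to their historical averages. The thresholds tau_peak and tau_apce below are hypothetical placeholders, not values from the paper.

```python
import numpy as np

def apce(response):
    """Average peak-to-correlation energy: large values indicate a sharp,
    unimodal response peak, i.e. a confident detection."""
    r_max, r_min = response.max(), response.min()
    return (r_max - r_min) ** 2 / np.mean((response - r_min) ** 2)

def update_dynamic_template(response, candidate, template,
                            peak_hist, apce_hist,
                            tau_peak=0.6, tau_apce=0.5):
    """Adopt the newly cropped target as the dynamic template only when the
    current response is confident relative to its history."""
    peak, conf = response.max(), apce(response)
    peak_hist.append(peak)
    apce_hist.append(conf)
    if peak >= tau_peak * np.mean(peak_hist) and \
       conf >= tau_apce * np.mean(apce_hist):
        return candidate   # high confidence: replace the dynamic template
    return template        # low confidence: keep the previous template
```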
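The background-suppression and fusion steps can likewise be sketched. The per-pixel target likelihood below follows the standard foreground/background color-histogram model used by trackers such as Staple [22] and DAT [18]; how the likelihood map modulates the deep features, and the fusion weight alpha, are assumptions standing in for the parameters α, β, and λ tuned in Tables 1 and 2.

```python
import numpy as np

def color_likelihood(search_rgb, fg_hist, bg_hist, n_bins=32, eps=1e-8):
    """Per-pixel P(target | color) on the search region, from color
    histograms of the estimated target box (fg) and its surround (bg)."""
    idx = (search_rgb.astype(np.int64) * n_bins) // 256   # quantize colors
    fg = fg_hist[idx[..., 0], idx[..., 1], idx[..., 2]]
    bg = bg_hist[idx[..., 0], idx[..., 1], idx[..., 2]]
    return fg / (fg + bg + eps)

def fuse_responses(resp_static, resp_dynamic, alpha=0.8):
    """Weighted fusion of the response maps produced by the static and
    dynamic templates; alpha is a hypothetical stand-in for the paper's
    fusion weights."""
    return alpha * resp_static + (1.0 - alpha) * resp_dynamic
```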
Table 1. Influence of parameters α and β on success rate (OTB2015)

α\β    0.75    0.80    0.85    0.90    0.95
0.75   0.592   0.596   0.599   0.607   0.594
0.80   0.602   0.604   0.607   0.612   0.603
0.85   0.594   0.596   0.603   0.609   0.598
0.90   0.591   0.598   0.601   0.608   0.602
0.95   0.589   0.593   0.599   0.602   0.597

Table 2. Influence of parameter λ on success rate (OTB2015)

λ              0.70    0.75    0.80    0.85    0.90
Success rate   0.609   0.612   0.610   0.606   0.604

Table 4. Comparison results of tracking success rate under different attributes

Algorithm   SV      OPR     IPR     OCC     DEF     FM      IV      BC      MB      OV      LR
Ours        0.598   0.615   0.589   0.583   0.572   0.601   0.582   0.571   0.603   0.578   0.618
SiamFC      0.551   0.556   0.579   0.564   0.507   0.570   0.552   0.515   0.545   0.470   0.582
DSiam       0.589   0.592   0.592   0.567   0.553   0.583   0.576   0.566   0.591   0.566   0.612
SiamTri     0.568   0.557   0.573   0.542   0.492   0.585   0.560   0.533   0.573   0.543   0.627
UDT         0.565   0.544   0.536   0.552   0.535   0.589   0.562   0.528   0.592   0.460   0.480
Staple      0.525   0.535   0.548   0.560   0.549   0.535   0.589   0.568   0.537   0.476   0.448
DSST        0.482   0.475   0.501   0.457   0.423   0.467   0.540   0.511   0.480   0.386   0.390
SRDCF       0.556   0.547   0.541   0.564   0.539   0.592   0.596   0.578   0.594   0.460   0.512

Table 5. Comparison results of tracking precision under different attributes

Algorithm   SV      OPR     IPR     OCC     DEF     FM      IV      BC      MB      OV      LR
Ours        0.796   0.816   0.781   0.772   0.737   0.777   0.752   0.749   0.743   0.719   0.862
SiamFC      0.732   0.747   0.742   0.720   0.690   0.732   0.713   0.690   0.701   0.669   0.875
DSiam       0.784   0.796   0.770   0.751   0.726   0.754   0.740   0.741   0.731   0.708   0.854
SiamTri     0.752   0.752   0.739   0.714   0.718   0.761   0.713   0.695   0.714   0.723   0.859
UDT         0.743   0.756   0.753   0.732   0.703   0.740   0.724   0.701   0.715   0.677   0.852
Staple      0.731   0.725   0.759   0.726   0.732   0.708   0.737   0.722   0.701   0.668   0.682
DSST        0.654   0.650   0.501   0.457   0.543   0.584   0.690   0.681   0.480   0.478   0.581
SRDCF       0.739   0.571   0.742   0.735   0.726   0.758   0.781   0.761   0.757   0.597   0.744

Table 6. Evaluation on VOT2015 in terms of accuracy, robustness, and EAO

Metric       Ours    DSiam   HCF     SRDCF   Struck  Staple  SiamFC  LDP
Accuracy     0.65    0.59    0.45    0.56    0.47    0.53    0.52    0.51
Robustness   1.03    0.94    0.39    1.24    1.26    1.35    0.88    1.84
EAO          0.296   0.284   0.220   0.288   0.246   0.300   0.274   0.278

Table 7. Comparison of tracking speed between our method and other trackers

Tracker          Ours   DSiam   SiamFC   SRDCF   MEEM   Struck   TADT   CFNet
Speed (frame/s)  29     45      58       5       10     20       33     41
References
[1] Hou Z Q, Han C Z. A survey of visual tracking[J]. Acta Automat Sin, 2006, 32(4): 603-617. https://www.cnki.com.cn/Article/CJFDTOTAL-MOTO200604016.htm
[2] Tang X M, Chen Z G, Fu Y. Anti-occlusion and re-tracking of real-time moving target based on kernelized correlation filter[J]. Opto-Electron Eng, 2020, 47(1): 190279. doi: 10.12086/oee.2020.190279
[3] Lu H C, Li P X, Wang D. Visual object tracking: a survey[J]. Patt Recog Artif Intell, 2018, 31(1): 61-76. https://www.cnki.com.cn/Article/CJFDTOTAL-MSSB201801008.htm
[4] Zhao C M, Chen Z B, Zhang J L. Research on target tracking based on convolutional networks[J]. Opto-Electron Eng, 2020, 47(1): 180668. doi: 10.12086/oee.2020.180668
[5] Bertinetto L, Valmadre J, Henriques J F, et al. Fully-convolutional Siamese networks for object tracking[C]//European Conference on Computer Vision, Cham, 2016: 850-865.
[6] Dong X P, Shen J B. Triplet loss in Siamese network for object tracking[C]//Proceedings of the European Conference on Computer Vision (ECCV), Cham, 2018.
[7] Wang Q, Gao J, Xing J L, et al. DCFNet: discriminant correlation filters network for visual tracking[Z]. arXiv: 1704.04057v1, 2017.
[8] Li B, Yan J J, Wu W, et al. High performance visual tracking with Siamese region proposal network[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018: 8971-8980.
[9] Russakovsky O, Deng J, Su H, et al. ImageNet large scale visual recognition challenge[J]. Int J Comput Vis, 2015, 115(3): 211-252. doi: 10.1007/s11263-015-0816-y
[10] Real E, Shlens J, Mazzocchi S, et al. YouTube-BoundingBoxes: a large high-precision human-annotated data set for object detection in video[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 2017: 5296-5305.
[11] Guo Q, Feng W, Zhou C, et al. Learning dynamic Siamese network for visual object tracking[C]//Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017: 1763-1771.
[12] Kuai Y L, Wen G J, Li D D. Masked and dynamic Siamese network for robust visual tracking[J]. Inf Sci, 2019, 503: 169-182. doi: 10.1016/j.ins.2019.07.004
[13] Wu Y, Lim J, Yang M H. Object tracking benchmark[J]. IEEE Trans Pattern Anal Mach Intell, 2015, 37(9): 1834-1848.
[14] Liang P P, Blasch E, Ling H B. Encoding color information for visual tracking: Algorithms and benchmark[J]. IEEE Trans Image Process, 2015, 24(12): 5630-5644. doi: 10.1109/TIP.2015.2482905
[15] Kristan M, Matas J, Leonardis A, et al. The visual object tracking VOT2015 challenge results[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision Workshops, Santiago, Chile, 2015: 1-23.
[16] Wang M M, Liu Y, Huang Z Y. Large margin object tracking with circulant feature maps[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 2017: 4021-4029.
[17] Hou Z Q, Chen L L, Yu W S, et al. Robust visual tracking algorithm based on Siamese network with dual templates[J]. J Electr Inf Technol, 2019, 41(9): 2247-2255. https://www.cnki.com.cn/Article/CJFDTOTAL-DZYX201909030.htm
[18] Possegger H, Mauthner T, Bischof H. In defense of color-based model-free tracking[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 2015: 2113-2120.
[19] Xie Y, Chen Y. Adaptive object tracking based on spatial attention mechanism[J]. Syst Eng Electr, 2019, 41(9): 1945-1954. https://www.cnki.com.cn/Article/CJFDTOTAL-XTYD201909005.htm
[20] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[C]//NIPS'12: Proceedings of the 25th International Conference on Neural Information Processing Systems, New York, NY, USA, 2012.
[21] Song Y B, Ma C, Gong L J, et al. CREST: convolutional residual learning for visual tracking[C]//Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017: 2555-2564.
[22] Bertinetto L, Valmadre J, Golodetz S, et al. Staple: Complementary learners for real-time tracking[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 2016: 1401-1409.
[23] Wang N, Song Y B, Ma C, et al. Unsupervised deep tracking[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019: 1308-1317.
[24] Danelljan M, Häger G, Khan F S, et al. Learning spatially regularized correlation filters for visual tracking[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 4310-4318.
[25] Danelljan M, Häger G, Khan F, et al. Accurate scale estimation for robust visual tracking[C]//British Machine Vision Conference, Nottingham, 2014.
[26] Zhang J M, Ma S G, Sclaroff S. MEEM: robust tracking via multiple experts using entropy minimization[C]//European Conference on Computer Vision, Cham, 2014: 188-203.
[27] Valmadre J, Bertinetto L, Henriques J, et al. End-to-end representation learning for correlation filter based tracking[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 2017: 2805-2813.
[28] Galoogahi H K, Fagg A, Lucey S. Learning background-aware correlation filters for visual tracking[C]//Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017: 1135-1143.
[29] Zhang Z P, Peng H W. Deeper and wider Siamese networks for real-time visual tracking[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 2019: 4591-4600.
[30] Li B, Wu W, Wang Q, et al. SiamRPN++: evolution of Siamese visual tracking with very deep networks[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019: 4282-4291.
[31] Li X, Ma C, Wu B Y, et al. Target-aware deep tracking[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 2019: 1369-1378.