Siamese network visual tracking algorithm based on online learning

Citation: Zhang C Y, Hou Z Q, Pu L, et al. Siamese network visual tracking algorithm based on online learning[J]. Opto-Electron Eng, 2021, 48(4): 200140. doi: 10.12086/oee.2021.200140

  • Fund Project: National Natural Science Foundation of China (61473309, 61703423)
More Information
    About the author:
    *Corresponding author: Hou Zhiqiang (b. 1973), male, Ph.D., professor and doctoral supervisor; his research focuses on image processing, computer vision, and information fusion. E-mail: hou_qz@163.com
  • CLC number: TP391
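  • Abstract: Visual tracking algorithms based on Siamese networks have become an important class of methods in visual tracking in recent years, performing well in both tracking speed and accuracy. However, most Siamese trackers rely on an offline-trained model and lack online updating of the tracker. To address this problem, this paper proposes a Siamese network visual tracking algorithm based on online learning. The algorithm adopts a dual-template scheme: the target in the first frame serves as the static template, and a high-confidence update strategy obtains a dynamic template in subsequent frames. During online tracking, a fast transform learning model learns the appearance changes of the target from the dual templates; at the same time, a target likelihood probability map of the search region is computed from the color histogram features of the current frame and fused with the deep features for background suppression learning. Finally, the response maps obtained from the two templates are fused by weighting to produce the final tracking result. Experimental results on the OTB2015, TempleColor128, and VOT datasets show that the proposed algorithm improves on several mainstream algorithms of recent years and tracks well in complex scenes such as target deformation, similar background interference, and fast motion.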

  • Overview: Visual tracking is a fundamental and challenging task in computer vision. Given the target state in the initial frame, a tracker predicts the target position in all subsequent frames. It is widely used in intelligent surveillance, autonomous driving, military detection, and other fields. During tracking, the target commonly undergoes scale change, motion blur, deformation, and occlusion. At present, most trackers based on discriminative models fall into two groups: correlation filter trackers, which use hand-crafted or CNN features, and Siamese network trackers. Siamese network trackers have become an important class of methods in recent years, with good tracking speed and accuracy. However, most of them rely on an offline-trained model and lack online updating of the tracker. Guo et al. proposed the DSiam algorithm, which builds a dynamic Siamese network containing a fast transform learning model and can learn target appearance changes and background suppression online during tracking. DSiam still has two shortcomings. First, it does not use the rich information in historical frames during tracking. Second, its background suppression applies only a Gaussian weight map to the search region, which cannot effectively highlight the target and suppress the background. To solve these problems, we propose a Siamese network visual tracking algorithm based on online learning. The main contributions are as follows:

    The algorithm adopts a dual-template scheme: it treats the target in the first frame as the static template and uses a high-confidence update strategy to obtain a dynamic template in subsequent frames.

    During online tracking, the fast transform learning model learns the appearance changes of the target from the dual templates. A target likelihood probability map of the search region is computed from the color histogram features of the current frame and fused with the deep features for background suppression learning.

    Finally, the response maps obtained from the two templates are fused by weighting to give the final prediction.

    Experimental results on the OTB2015, TempleColor128, and VOT datasets show that the proposed algorithm improves on several mainstream algorithms of recent years and tracks well under target deformation, similar background interference, fast motion, and other challenging scenarios.
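
The target likelihood probability map described in the second contribution can be illustrated with a per-bin Bayes rule over color histograms, in the spirit of the color-based tracker in [18]. The following is a minimal sketch, not the authors' implementation: the function names, the 16-bin RGB quantization, the neutral value 0.5 for unseen colors, and the element-wise weighting of the feature map are all assumptions, and the likelihood map is assumed to have already been resized to the spatial size of the deep features.

```python
import numpy as np


def color_likelihood_map(search, target_box, n_bins=16):
    """Per-pixel target likelihood from color histograms (hypothetical helper).

    search:     H x W x 3 uint8 image of the search region.
    target_box: (x0, y0, x1, y1) of the target inside `search`.
    """
    h, w = search.shape[:2]
    x0, y0, x1, y1 = target_box

    # Quantize each channel into n_bins and merge into one bin index per pixel.
    q = (search.astype(np.int64) * n_bins) // 256
    idx = (q[..., 0] * n_bins + q[..., 1]) * n_bins + q[..., 2]

    # Pixels inside the box vote for the target, all others for the background.
    inside = np.zeros((h, w), dtype=bool)
    inside[y0:y1, x0:x1] = True
    n_total = n_bins ** 3
    hist_obj = np.bincount(idx[inside], minlength=n_total).astype(np.float64)
    hist_bg = np.bincount(idx[~inside], minlength=n_total).astype(np.float64)

    # Bayes rule per bin: P(target | bin). Unseen bins get the neutral value 0.5.
    denom = hist_obj + hist_bg
    p_bin = np.where(denom > 0, hist_obj / np.maximum(denom, 1.0), 0.5)

    # Look up the probability of each pixel's bin.
    return p_bin[idx]


def suppress_background(features, likelihood):
    """Weight a C x H x W feature map by an H x W likelihood map (assumed fusion)."""
    return features * likelihood[None, :, :]
```

Unlike a fixed Gaussian window, a map of this kind depends on the colors actually present in the current frame, so background regions whose colors differ from the target are down-weighted wherever they appear in the search region.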

  • Figure 1.  Schematic diagram of a tracking algorithm based on a Siamese network

    Figure 2.  Visual tracking based on online learning

    Figure 3.  Search region and its target likelihood probability map

    Figure 4.  Comparison of partial tracking results of 5 algorithms

    Figure 5.  Success (a) and precision (b) plots of different algorithms on the OTB2015 dataset

    Figure 6.  Success (a) and precision (b) plots of different algorithms on the TempleColor128 dataset

    Figure 7.  Success (a) and precision (b) plots on the OTB2015 dataset with different modules added to the algorithm

    Table 1.  Influence of the values of parameters α and β on success rate (OTB2015)

    | α \ β | 0.75 | 0.80 | 0.85 | 0.90 | 0.95 |
    |---|---|---|---|---|---|
    | 0.75 | 0.592 | 0.596 | 0.599 | 0.607 | 0.594 |
    | 0.80 | 0.602 | 0.604 | 0.607 | 0.612 | 0.603 |
    | 0.85 | 0.594 | 0.596 | 0.603 | 0.609 | 0.598 |
    | 0.90 | 0.591 | 0.598 | 0.601 | 0.608 | 0.602 |
    | 0.95 | 0.589 | 0.593 | 0.599 | 0.602 | 0.597 |
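
This listing does not say what α and β control. Assuming they are the ratio thresholds of a high-confidence update rule in the style of LMCF [16], where a frame may refresh the dynamic template only if its response peak and its average peak-to-correlation energy (APCE) both stay close to their historical means, the gate could be sketched as follows. The class name, the APCE criterion, the assignment of α to the peak and β to APCE, and the choice to keep every frame in the history are assumptions for illustration, not the authors' stated method. The defaults 0.80 and 0.90 are the best cell of Table 1, read with α indexing rows and β indexing columns.

```python
import numpy as np


def apce(response):
    """Average peak-to-correlation energy of a 2-D response map."""
    f_max, f_min = response.max(), response.min()
    # A small constant avoids division by zero on a perfectly flat map.
    energy = np.mean((response - f_min) ** 2) + 1e-12
    return float((f_max - f_min) ** 2 / energy)


class HighConfidenceGate:
    """Decides whether the current frame may update the dynamic template.

    alpha and beta are ASSUMED to be ratio thresholds on the response peak
    and on APCE, relative to the running means over all previous frames.
    """

    def __init__(self, alpha=0.80, beta=0.90):
        self.alpha = alpha
        self.beta = beta
        self.peaks = []   # history of response peaks
        self.apces = []   # history of APCE values

    def should_update(self, response):
        peak = float(response.max())
        score = apce(response)
        if not self.peaks:
            # No history yet: accept the first frame by convention.
            confident = True
        else:
            confident = (peak >= self.alpha * np.mean(self.peaks)
                         and score >= self.beta * np.mean(self.apces))
        # Every frame enters the history, whether or not it was accepted.
        self.peaks.append(peak)
        self.apces.append(score)
        return bool(confident)
```

Under this reading, larger α or β makes updates rarer, which protects the dynamic template from occluded or drifting frames at the cost of adapting more slowly to genuine appearance change.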

    Table 2.  Influence of the value of parameter λ on success rate (OTB2015)

    | λ | 0.70 | 0.75 | 0.80 | 0.85 | 0.90 |
    |---|---|---|---|---|---|
    | Success rate | 0.609 | 0.612 | 0.610 | 0.606 | 0.604 |
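
The abstract states that the response maps of the two templates are fused by weighting. The sketch below assumes that λ in Table 2 is the weight of the static-template response and 1 − λ the weight of the dynamic-template response; this listing does not define λ, so both its role and which template receives which weight are assumptions. The function name is hypothetical.

```python
import numpy as np


def fuse_responses(resp_static, resp_dynamic, lam=0.75):
    """Weighted fusion of two response maps and location of the fused peak.

    lam is ASSUMED to weight the static-template response; the default 0.75
    is the value with the highest success rate in Table 2.
    """
    if resp_static.shape != resp_dynamic.shape:
        raise ValueError("response maps must have the same shape")
    fused = lam * resp_static + (1.0 - lam) * resp_dynamic
    # The predicted target position is the location of the fused maximum.
    row, col = np.unravel_index(np.argmax(fused), fused.shape)
    return fused, (int(row), int(col))
```

With a weight above 0.5 the first-frame template dominates, so the dynamic template can refine the estimate but cannot easily pull it away from a strong static-template peak.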

    Table 4.  Comparison of tracking success rates of the algorithms under different attributes

    | Algorithm | SV | OPR | IPR | OCC | DEF | FM | IV | BC | MB | OV | LR |
    |---|---|---|---|---|---|---|---|---|---|---|---|
    | Ours | 0.598 | 0.615 | 0.589 | 0.583 | 0.572 | 0.601 | 0.582 | 0.571 | 0.603 | 0.578 | 0.618 |
    | SiamFC | 0.551 | 0.556 | 0.579 | 0.564 | 0.507 | 0.570 | 0.552 | 0.515 | 0.545 | 0.470 | 0.582 |
    | DSiam | 0.589 | 0.592 | 0.592 | 0.567 | 0.553 | 0.583 | 0.576 | 0.566 | 0.591 | 0.566 | 0.612 |
    | SiamTri | 0.568 | 0.557 | 0.573 | 0.542 | 0.492 | 0.585 | 0.560 | 0.533 | 0.573 | 0.543 | 0.627 |
    | UDT | 0.565 | 0.544 | 0.536 | 0.552 | 0.535 | 0.589 | 0.562 | 0.528 | 0.592 | 0.460 | 0.480 |
    | Staple | 0.525 | 0.535 | 0.548 | 0.560 | 0.549 | 0.535 | 0.589 | 0.568 | 0.537 | 0.476 | 0.448 |
    | DSST | 0.482 | 0.475 | 0.501 | 0.457 | 0.423 | 0.467 | 0.540 | 0.511 | 0.480 | 0.386 | 0.390 |
    | SRDCF | 0.556 | 0.547 | 0.541 | 0.564 | 0.539 | 0.592 | 0.596 | 0.578 | 0.594 | 0.460 | 0.512 |

    Table 5.  Comparison of tracking precision of the algorithms under different attributes

    | Algorithm | SV | OPR | IPR | OCC | DEF | FM | IV | BC | MB | OV | LR |
    |---|---|---|---|---|---|---|---|---|---|---|---|
    | Ours | 0.796 | 0.816 | 0.781 | 0.772 | 0.737 | 0.777 | 0.752 | 0.749 | 0.743 | 0.719 | 0.862 |
    | SiamFC | 0.732 | 0.747 | 0.742 | 0.720 | 0.690 | 0.732 | 0.713 | 0.690 | 0.701 | 0.669 | 0.875 |
    | DSiam | 0.784 | 0.796 | 0.770 | 0.751 | 0.726 | 0.754 | 0.740 | 0.741 | 0.731 | 0.708 | 0.854 |
    | SiamTri | 0.752 | 0.752 | 0.739 | 0.714 | 0.718 | 0.761 | 0.713 | 0.695 | 0.714 | 0.723 | 0.859 |
    | UDT | 0.743 | 0.756 | 0.753 | 0.732 | 0.703 | 0.740 | 0.724 | 0.701 | 0.715 | 0.677 | 0.852 |
    | Staple | 0.731 | 0.725 | 0.759 | 0.726 | 0.732 | 0.708 | 0.737 | 0.722 | 0.701 | 0.668 | 0.682 |
    | DSST | 0.654 | 0.650 | 0.501 | 0.457 | 0.543 | 0.584 | 0.690 | 0.681 | 0.480 | 0.478 | 0.581 |
    | SRDCF | 0.739 | 0.571 | 0.742 | 0.735 | 0.726 | 0.758 | 0.781 | 0.761 | 0.757 | 0.597 | 0.744 |

    Table 6.  Comparison of accuracy and robustness of different algorithms on the VOT2015 dataset

    | Metric | Ours | DSiam | HCF | SRDCF | Struck | Staple | SiamFC | LDP |
    |---|---|---|---|---|---|---|---|---|
    | Accuracy | 0.65 | 0.59 | 0.45 | 0.56 | 0.47 | 0.53 | 0.52 | 0.51 |
    | Robustness | 1.03 | 0.94 | 0.39 | 1.24 | 1.26 | 1.35 | 0.88 | 1.84 |
    | EAO | 0.296 | 0.284 | 0.220 | 0.288 | 0.246 | 0.300 | 0.274 | 0.278 |

    Table 7.  Comparison of tracking speed between the proposed algorithm and other trackers

    | Tracker | Ours | DSiam | SiamFC | SRDCF | MEEM | Struck | TADT | CFNet |
    |---|---|---|---|---|---|---|---|---|
    | Speed | 29 | 45 | 58 | 5 | 10 | 20 | 33 | 41 |
  • [1] Hou Z Q, Han C Z. A survey of visual tracking[J]. Acta Automat Sin, 2006, 32(4): 603-617. https://www.cnki.com.cn/Article/CJFDTOTAL-MOTO200604016.htm
    [2] Tang X M, Chen Z G, Fu Y. Anti-occlusion and re-tracking of real-time moving target based on kernelized correlation filter[J]. Opto-Electron Eng, 2020, 47(1): 190279. doi: 10.12086/oee.2020.190279
    [3] Lu H C, Li P X, Wang D. Visual object tracking: a survey[J]. Patt Recog Artif Intell, 2018, 31(1): 61-76. https://www.cnki.com.cn/Article/CJFDTOTAL-MSSB201801008.htm
    [4] Zhao C M, Chen Z B, Zhang J L. Research on target tracking based on convolutional networks[J]. Opto-Electron Eng, 2020, 47(1): 180668. doi: 10.12086/oee.2020.180668
    [5] Bertinetto L, Valmadre J, Henriques J F, et al. Fully-convolutional Siamese networks for object tracking[C]//European Conference on Computer Vision, Cham, 2016: 850-865.
    [6] Dong X P, Shen J B. Triplet loss in Siamese network for object tracking[C]//Proceedings of the European Conference on Computer Vision (ECCV), Cham, 2018.
    [7] Wang Q, Gao J, Xing J L, et al. DCFNet: Discriminant correlation filters network for visual tracking[Z]. arXiv: 1704.04057v1, 2017.
    [8] Li B, Yan J J, Wu W, et al. High performance visual tracking with Siamese region proposal network[C]//Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018: 8971-8980.
    [9] Russakovsky O, Deng J, Su H, et al. ImageNet large scale visual recognition challenge[J]. Int J Comput Vis, 2015, 115(3): 211-252. doi: 10.1007/s11263-015-0816-y
    [10] Real E, Shlens J, Mazzocchi S, et al. YouTube-BoundingBoxes: A large high-precision human-annotated data set for object detection in video[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 2017: 5296-5305.
    [11] Guo Q, Feng W, Zhou C, et al. Learning dynamic Siamese network for visual object tracking[C]//Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017: 1763-1771.
    [12] Kuai Y L, Wen G J, Li D D. Masked and dynamic Siamese network for robust visual tracking[J]. Inf Sci, 2019, 503: 169-182. doi: 10.1016/j.ins.2019.07.004
    [13] Wu Y, Lim J, Yang M H. Object tracking benchmark[J]. IEEE Trans Patt Anal Mach Intellig, 2015, 37(9): 1834-1848.
    [14] Liang P P, Blasch E, Ling H B. Encoding color information for visual tracking: Algorithms and benchmark[J]. IEEE Trans Image Process, 2015, 24(12): 5630-5644. doi: 10.1109/TIP.2015.2482905
    [15] Kristan M, Matas J, Leonardis A, et al. The visual object tracking VOT2015 challenge results[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision Workshops, Santiago, Chile, 2015: 1-23.
    [16] Wang M M, Liu Y, Huang Z Y. Large margin object tracking with circulant feature maps[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 2017: 4021-4029.
    [17] Hou Z Q, Chen L L, Yu W S, et al. Robust visual tracking algorithm based on Siamese network with dual templates[J]. J Electr Inf Technol, 2019, 41(9): 2247-2255. https://www.cnki.com.cn/Article/CJFDTOTAL-DZYX201909030.htm
    [18] Possegger H, Mauthner T, Bischof H. In defense of color-based model-free tracking[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 2015: 2113-2120.
    [19] Xie Y, Chen Y. Adaptive object tracking based on spatial attention mechanism[J]. Syst Eng Electr, 2019, 41(9): 1945-1954. https://www.cnki.com.cn/Article/CJFDTOTAL-XTYD201909005.htm
    [20] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[C]//NIPS'12: Proceedings of the 25th International Conference on Neural Information Processing Systems, New York, NY, USA, 2012.
    [21] Song Y B, Ma C, Gong L J, et al. CREST: Convolutional residual learning for visual tracking[C]//Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017: 2555-2564.
    [22] Bertinetto L, Valmadre J, Golodetz S, et al. Staple: Complementary learners for real-time tracking[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 2016: 1401-1409.
    [23] Wang N, Song Y B, Ma C, et al. Unsupervised deep tracking[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019: 1308-1317.
    [24] Danelljan M, Häger G, Khan F S, et al. Learning spatially regularized correlation filters for visual tracking[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 4310-4318.
    [25] Danelljan M, Häger G, Khan F, et al. Accurate scale estimation for robust visual tracking[C]//British Machine Vision Conference, Nottingham, 2014.
    [26] Zhang J M, Ma S G, Sclaroff S. MEEM: robust tracking via multiple experts using entropy minimization[C]//European Conference on Computer Vision, Cham, 2014: 188-203.
    [27] Valmadre J, Bertinetto L, Henriques J, et al. End-to-end representation learning for correlation filter based tracking[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 2017: 2805-2813.
    [28] Galoogahi H K, Fagg A, Lucey S. Learning background-aware correlation filters for visual tracking[C]//Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017: 1135-1143.
    [29] Zhang Z P, Peng H W. Deeper and wider Siamese networks for real-time visual tracking[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 2019: 4591-4600.
    [30] Li B, Wu W, Wang Q, et al. SiamRPN++: Evolution of Siamese visual tracking with very deep networks[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019: 4282-4291.
    [31] Li X, Ma C, Wu B Y, et al. Target-aware deep tracking[C]//Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 2019: 1369-1378.

Publication history
Received:  2020-04-24
Revised:  2020-10-20
Published:  2021-04-15
