Infrared target detection and recognition in complex scene

Citation: Zhang R Z, Zhang J L, Qi X P, et al. Infrared target detection and recognition in complex scene[J]. Opto-Electron Eng, 2020, 47(10): 200314. doi: 10.12086/oee.2020.200314

  • Fund Project: Supported by the National High Technology Research and Development Program (863 Program) of China (G158207)
  • Author biography:
    *Corresponding author: Qi Xiaoping (1974-), male, Associate Researcher. His research focuses on high-precision localization of extended targets and on the optimization and real-time implementation of image processing algorithms. E-mail: qixiaoping@163.com
  • CLC number: TP391.41; TN219
  • Abstract: Mainstream object detection networks perform very well on high-quality RGB images, but their detection performance drops noticeably when they are applied to low-resolution infrared images. To improve infrared target detection and recognition in complex scenes, this paper adopts the following measures. First, drawing on domain adaptation, suitable infrared image preprocessing is applied to make the infrared images closer to RGB images, so that mainstream detection networks can be applied and the detection accuracy further improved. Second, the single-stage detector YOLOv3 is taken as the base network and its original MSE loss function is replaced with the GIoU loss; experiments verify that this clearly improves detection accuracy on the public FLIR infrared dataset. Third, to address the large span of target sizes in the FLIR dataset, an SPP module is added, following the spatial pyramid idea, to enrich the representational power of the feature maps and enlarge their receptive field. Experiments show that the adopted methods further improve detection accuracy.

  • Overview: In recent years, with the continuous development of computer vision, deep-learning-based object detection has improved significantly. However, mainstream detection networks are built around RGB images, and relatively little work has been devoted to infrared target detection. Moreover, while these networks detect targets very well in high-quality RGB images, their performance drops markedly on infrared images of poorer resolution. Compared with infrared images, visible images offer higher imaging resolution and rich target detail, but under certain weather conditions visible images cannot be obtained. Infrared imaging, in contrast, offers long range, strong anti-interference ability, high measurement accuracy, insensitivity to weather, day-and-night operation, and a strong ability to penetrate smoke, so it has been widely used since it was introduced, and the demand for infrared target detection is correspondingly urgent.

    In order to improve infrared target detection in complex scenes, the following measures are adopted in this paper. First, drawing on domain adaptation, appropriate infrared image preprocessing is applied to make the infrared images closer to RGB images, so that a mainstream detection network can be applied and the detection accuracy further improved. Second, the mean squared error (MSE) loss treats each coordinate of a bounding box as an independent variable, ignoring the integrity of the box as a whole, and ln-norm losses are sensitive to object scale; the algorithm therefore takes the single-stage detector YOLOv3 as its base network and replaces the original MSE loss with the GIoU loss. Experiments verify that detection accuracy on FLIR, an open infrared dataset, improves significantly and that the inaccurate localization of the original network is effectively alleviated. Third, to cope with the large span of target sizes in the FLIR dataset, an SPP module is added, following the spatial pyramid idea, to enrich the representational power of the feature maps and enlarge their receptive field. The experimental results show that the detection error rate decreases after the SPP module is added, and that once this shortcoming of the original YOLOv3 is overcome, detection accuracy improves further compared with modifying only the loss function.
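    For reference, the sketch below shows how the GIoU loss described above can be computed for a pair of axis-aligned boxes in (x1, y1, x2, y2) form, following the definition of Rezatofighi et al.; it is a generic NumPy illustration, not the authors' training code, and the example boxes are made up.

```python
import numpy as np

def giou_loss(pred, target, eps=1e-7):
    """Return 1 - GIoU for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)

    # Areas of the predicted and ground-truth boxes.
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_t = (target[2] - target[0]) * (target[3] - target[1])

    # Intersection and union.
    ix1, iy1 = max(pred[0], target[0]), max(pred[1], target[1])
    ix2, iy2 = min(pred[2], target[2]), min(pred[3], target[3])
    inter = max(ix2 - ix1, 0.0) * max(iy2 - iy1, 0.0)
    union = area_p + area_t - inter
    iou = inter / (union + eps)

    # Smallest enclosing box C of the two boxes.
    cx1, cy1 = min(pred[0], target[0]), min(pred[1], target[1])
    cx2, cy2 = max(pred[2], target[2]), max(pred[3], target[3])
    area_c = (cx2 - cx1) * (cy2 - cy1)

    # GIoU = IoU - |C \ (A U B)| / |C|; the regression loss is 1 - GIoU.
    giou = iou - (area_c - union) / (area_c + eps)
    return 1.0 - giou

# Two boxes with no overlap: IoU is 0 regardless of their distance,
# but the GIoU loss keeps growing as the boxes move further apart.
print(giou_loss([10, 10, 50, 50], [60, 60, 100, 100]))    # ~1.60
print(giou_loss([10, 10, 50, 50], [120, 120, 160, 160]))  # larger
```

    Because the enclosing-box term grows with the separation between the boxes, the loss still carries a useful signal when the prediction and the ground truth do not overlap at all, which a plain IoU loss does not.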

    Figure 1.  Different infrared images (one per row) from the FLIR dataset. (a) Original image; (b) Inversion; (c) Histogram equalization; (d) Denoising + image sharpening
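    The three preprocessing variants in Fig. 1 (compared quantitatively in Table 2) can be reproduced with standard OpenCV operations. The sketch below is a minimal illustration under assumed settings: the FLIR frames are read as 8-bit single-channel images, the file path flir_frame.jpeg is a placeholder, and the denoising strength, window sizes, and sharpening kernel are illustrative choices rather than the authors' exact parameters.

```python
import cv2
import numpy as np

def invert(ir):
    """Invert an 8-bit infrared frame so hot targets become dark and vice versa."""
    return 255 - ir

def hist_equalize(ir):
    """Global histogram equalization to stretch the infrared contrast."""
    return cv2.equalizeHist(ir)

def denoise_and_sharpen(ir, strength=10):
    """Non-local-means denoising followed by a simple 3x3 sharpening filter."""
    denoised = cv2.fastNlMeansDenoising(ir, None, strength, 7, 21)
    kernel = np.array([[0, -1, 0],
                       [-1, 5, -1],
                       [0, -1, 0]], dtype=np.float32)
    return cv2.filter2D(denoised, -1, kernel)

def to_three_channels(ir):
    """Replicate the single thermal channel so the frame matches RGB network input."""
    return cv2.cvtColor(ir, cv2.COLOR_GRAY2BGR)

if __name__ == "__main__":
    # "flir_frame.jpeg" is a placeholder path for one 8-bit FLIR frame.
    ir = cv2.imread("flir_frame.jpeg", cv2.IMREAD_GRAYSCALE)
    assert ir is not None, "replace the placeholder path with a real FLIR frame"
    variants = {
        "original": ir,
        "inversion": invert(ir),
        "hist_eq": hist_equalize(ir),
        "denoise_sharpen": denoise_and_sharpen(ir),
    }
    inputs = {name: to_three_channels(img) for name, img in variants.items()}
    print({name: img.shape for name, img in inputs.items()})
```

    Replicating the single thermal channel to three channels at the end keeps the input shape compatible with detection networks pretrained on RGB images.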

    Figure 2.  Modified YOLOv3 network structure

    Figure 3.  SPP module
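    Figure 3 shows the SPP module inserted into YOLOv3. As a rough reference, the sketch below follows the common YOLOv3-SPP layout of three parallel max-pooling branches concatenated with the input; the kernel sizes (5, 9, 13) and the example feature-map shape are assumptions and are not taken from the paper.

```python
import torch
import torch.nn as nn

class SPP(nn.Module):
    """Spatial pyramid pooling block: parallel max-pooling branches whose
    outputs are concatenated with the input along the channel dimension."""

    def __init__(self, kernel_sizes=(5, 9, 13)):
        super().__init__()
        # stride 1 with padding k // 2 keeps the spatial size unchanged.
        self.pools = nn.ModuleList(
            [nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2) for k in kernel_sizes]
        )

    def forward(self, x):
        # Channels grow to in_channels * (1 + len(kernel_sizes)).
        return torch.cat([x] + [pool(x) for pool in self.pools], dim=1)

if __name__ == "__main__":
    feat = torch.randn(1, 512, 13, 13)   # example deep feature map
    print(SPP()(feat).shape)             # torch.Size([1, 2048, 13, 13])
```

    Because the pooling branches use stride 1 with matching padding, the spatial size is preserved while the effective receptive field grows, which is the property the paper exploits to handle the large span of target sizes in FLIR.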

    Figure 4.  Algorithm flow chart

    Figure 5.  Detection speed and accuracy of different networks for (a) all categories; (b) car; (c) person; (d) bicycle

    Figure 6.  (a) YOLOv3 detection results; (b) Ground truth; (c) YOLOv3 detection results with the GIoU loss function; (d) YOLOv3 detection results with the GIoU loss function and the SPP module

    Table 1.  Detection results on the FLIR infrared dataset when training starts from weights pretrained on ImageNet and on MS COCO

    Pretrained weights     mAP/%
    ImageNet               49.86
    MS COCO                58.02

    Table 2.  Detection results of YOLOv3 trained on inputs produced by different preprocessing methods

    Preprocessing method       mAP/%
    Original image             58.02
    Inversion                  60.71
    Histogram equalization     57.26
    Denoising + sharpening     57.42

    Table 3.  Detection results of different frameworks on the FLIR dataset. The Faster R-CNN IoU threshold is 0.3 and the YOLOv3 IoU threshold is 0.6

    Detection method          AP(car)/%   AP(person)/%   AP(bicycle)/%   mAP/%   Test time/s
    Faster R-CNN (VGG16)      74.15       62.14          43.58           59.96   0.070
    Faster R-CNN (Res101)     77.96       65.72          45.86           63.18   0.163
    YOLOv3                    77.69       57.47          39.74           58.02   0.026
    Ours (YOLOv3+GIoU)        79.30       68.10          31.90           59.70   0.042
    Ours (YOLOv3+GIoU+SPP)    81.90       72.60          49.00           66.80   0.039

Article history
Received:  2020-08-20
Revised:  2020-09-22
Published:  2020-10-15
