Progressive multi-granularity ResNet vehicle recognition network

Citation: Xu S J, Jing Y, Li H T, et al. Progressive multi-granularity ResNet vehicle recognition network[J]. Opto-Electron Eng, 2023, 50(7): 230052. doi: 10.12086/oee.2023.230052

  • Fund Project: supported by the National Natural Science Foundation of China (51678470, 61803293), the Shaanxi Provincial Department of Education Special Research Fund (18JK0477, 2017JM6106), the Shaanxi Province Natural Science Basic Research Program (2020JM-472, 2020JM-473, 2019JQ-760), the Basic Research Fund of Xi'an University of Architecture and Technology (JC1703, JC1706), and the Shaanxi Provincial Department of Science and Technology Social Development Research Project (2021SF-429)
  • Corresponding author: Jing Yang, jingyang0525@xauat.edu.cn
  • CLC number: TP391.4
  • Abstract: To address the difficulty of recognizing vehicle models under imaging variations such as pose and viewpoint, a progressive multi-granularity ResNet vehicle recognition network is proposed. First, with ResNet as the backbone, a progressive multi-granularity local convolution block is proposed that performs local convolutions on vehicle images at different granularity levels, so that the network attends to local vehicle features at each granularity during reconstruction. Second, a random channel dropping block is applied to the multi-granularity local feature maps, suppressing the network's attention to salient vehicle regions and increasing its attention to non-salient features. Finally, a progressive multi-granularity training block is proposed that adds a classification loss at each training step, guiding the network to extract more discriminative and diverse multi-scale vehicle features. Experimental results show that the proposed network reaches recognition accuracies of 95.7%, 98.8%, and 97.4% on the Stanford Cars dataset, the CompCars web dataset, and the real-scene vehicle dataset VMRURS, respectively; compared with the contrast networks, it achieves both higher accuracy and better robustness.

  • Overview: Vehicle model recognition aims to identify specific information such as the brand, model, and year of a vehicle, which helps verify the accuracy of tracked vehicle information. There are two research strategies for this task. Strongly supervised methods use image-level labels together with additional annotations such as bounding boxes and part information, whereas weakly supervised methods perform fine-grained classification from image-level labels alone. Most weakly supervised methods rely on attention mechanisms, bilinear convolutional neural networks, or metric learning; they tend to focus on large-granularity salient regions such as the grille and tires while ignoring small but discriminative features such as the logo and door handles. To address the difficulty of recognizing vehicle models under imaging variations such as pose and viewpoint, a progressive multi-granularity ResNet recognition network is proposed. First, with ResNet as the backbone, a progressive multi-granularity local convolution block performs local convolutions on vehicle images at different granularity levels, so that the network attends to local vehicle features at each granularity during reconstruction. Second, a random channel dropping block randomly discards channels of the multi-granularity local feature maps, suppressing the network's attention to salient vehicle regions and increasing its attention to non-salient features. Finally, a classification loss is added at each training step; by dividing the training process into stages, the network effectively fuses the extracted multi-scale vehicle features and is guided to extract more discriminative and diverse features. Experimental results show that the proposed network reaches 95.7%, 98.8%, and 97.4% recognition accuracy on the Stanford Cars dataset, the CompCars web dataset, and the real-scene vehicle dataset VMRURS, respectively. Compared with the contrast networks, it achieves both higher accuracy and better robustness, performing well in challenging real scenes such as low light and deformed vehicles, which demonstrates its effectiveness for on-road vehicle recognition.
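The multi-granularity partition described above can be sketched in pure Python: an input grid is split into n × n patches that are randomly permuted before local processing, so each granularity level n exposes different local structure. This is a minimal illustration of the jigsaw-style operation P(x, n) used in progressive training; the function names (`split_patches`, `jigsaw_shuffle`) are illustrative, not the paper's actual code.

```python
import random

def split_patches(grid, n):
    """Split a square 2D grid (list of lists) into n*n equal square patches."""
    size = len(grid)
    assert size % n == 0, "grid side must be divisible by n"
    p = size // n  # patch side length
    patches = []
    for bi in range(n):
        for bj in range(n):
            patch = [row[bj * p:(bj + 1) * p] for row in grid[bi * p:(bi + 1) * p]]
            patches.append(patch)
    return patches

def merge_patches(patches, n):
    """Inverse of split_patches: reassemble n*n patches into one grid."""
    p = len(patches[0])
    grid = []
    for bi in range(n):
        for r in range(p):
            row = []
            for bj in range(n):
                row.extend(patches[bi * n + bj][r])
            grid.append(row)
    return grid

def jigsaw_shuffle(grid, n, seed=0):
    """P(x, n): randomly permute the n*n patches of the input grid."""
    patches = split_patches(grid, n)
    random.Random(seed).shuffle(patches)
    return merge_patches(patches, n)
```

At a fine granularity (e.g., n = 8) the network sees small shuffled local patches; at n = 1 the image passes through unchanged, which matches the coarse final stage.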

  • Figure 1.  Overall structure of the proposed network

    Figure 2.  Progressive multi-granularity local convolution block (PLCB)

    Figure 3.  Random channel drop block (RCDB) schematic diagram

    Figure 4.  Progressive multi-granularity training block (PMTB) schematic diagram

    Figure 5.  Top-1 accuracy (%) curves: effect of the β value on RCDB on (a) Stanford-cars; (b) Compcars; (c) VMRURS

    Figure 6.  Network training and validation process

    Figure 7.  Visual comparison of vehicle recognition at each stage

    Figure 8.  Visual comparison after adding each module

    Figure 9.  Visual comparison of vehicle recognition across different networks

    Table 1.  Progressive multi-granularity training steps

    Progressive multi-granularity training procedure
    Input: training dataset D; training batches x with labels y; P denotes the progressive multi-granularity operation; L_CE denotes the cross-entropy (CE) loss
    For epoch ∈ [0, epochs] do
        For b ∈ [0, batches] do
            x, y ⇐ batch b of D
            For l ∈ [L − S + 1, L] do
                y^l ⇐ H^l_class[H^l_Conv(F^l(P(x, n)))]
                L_l ⇐ L_CE(y^l, y)
                Backpropagate L_l
            End for
            y^Concat = H^Concat_class{Concat[V^(L−S+1), …, V^L]}
            L_Concat ⇐ L_CE(y^Concat, y)
            Backpropagate L_Concat
        End for
    End for
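The loop structure of the training procedure can be sketched as plain Python with stand-in components: each batch triggers one loss and backward pass per granularity stage, plus a final pass on the concatenated features. `S`, `GRANULARITY`, and every function here are placeholders for illustration, not the paper's implementation.

```python
S = 4                                      # number of progressive stages (assumed)
GRANULARITY = {1: 8, 2: 4, 3: 2, 4: 1}     # split size n per stage, coarse last

def cross_entropy(pred, label):
    # stand-in scalar loss: distance between predicted and true class index
    return abs(pred - label)

def stage_forward(x, stage):
    # placeholder for H^l_class[H^l_Conv(F^l(P(x, n)))] at one granularity
    n = GRANULARITY[stage]
    return (sum(x) + n) % 10               # fake class score

def train_batch(x, y, log):
    """Run the S per-stage updates plus the concatenated-feature update."""
    losses = []
    for stage in range(1, S + 1):
        pred = stage_forward(x, stage)
        loss = cross_entropy(pred, y)
        log.append(("stage", stage, loss))  # stands in for "Backpropagate L_l"
        losses.append(loss)
    # final step: classify the concatenation of all stage features
    concat_pred = sum(stage_forward(x, s) for s in range(1, S + 1)) % 10
    concat_loss = cross_entropy(concat_pred, y)
    log.append(("concat", None, concat_loss))  # "Backpropagate L_Concat"
    return losses, concat_loss
```

The point of the sketch is the update schedule: S + 1 separate loss computations per batch, so gradients from fine granularities do not drown out the coarse, whole-image stage.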

    Table 2.  Comparison of PLCB split sizes at each stage

    Stage1   Stage2   Stage3   Stage4   Accuracy/%
    1        1        1        1        93.5
    2        2        2        2        93.9
    4        4        4        4        94.2
    8        8        8        8        94.0
    16       8        4        2        93.6
    8        4        2        1        94.5
    4        2        1        1        93.9

    Table 3.  Ablation of recognition accuracy with RCDB inserted after different ResNet50 layers

    Configuration                   Accuracy/%
    ResNet50 (baseline, no RCDB)    91.5
    RCDB after Layer1               92.3
    RCDB after Layer2               92.9
    RCDB after Layer3               93.0
    RCDB after Layer4               92.6
    RCDB (final configuration)      93.2
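The channel dropping that RCDB performs can be sketched as: each channel of a [C][H][W] feature map is zeroed with probability β, so the network cannot rely on a few salient channels. This is a minimal pure-Python illustration; the function name and the β interface are assumptions (Figure 5 sweeps β on each dataset).

```python
import random

def random_channel_drop(feature_map, beta, seed=None):
    """Zero each channel of a [C][H][W] feature map with probability beta."""
    rng = random.Random(seed)
    out = []
    for channel in feature_map:
        if rng.random() < beta:
            # drop: replace the whole channel with zeros
            out.append([[0.0 for _ in row] for row in channel])
        else:
            # keep: deep-copy the channel unchanged
            out.append([row[:] for row in channel])
    return out
```

As with standard dropout-style modules, one would apply this during training only and keep all channels at inference.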

    Table 4.  Ablation results as each module is added to the network

    Baseline   PLCB   RCDB   Accuracy/%
    ✓                        91.5
    ✓          ✓             94.8
    ✓                 ✓      93.2
    ✓          ✓      ✓      95.7

    Table 5.  Comparison of recognition accuracy of different networks

    Methods        Backbone   Stanford-cars/%   Compcars/%   VMRURS/%   Speed/(f/s)   Params/M   FLOPs/G
    Baseline[25]   ResNet50   91.5              94.1         87.1       4.15          23.50      33.05
    FBSD[31]       ResNet50   94.4              96.8         92.3       1.73          46.82      53.11
    LIO[32]        ResNet50   94.5              96.8         94.2       3.60          24.57      33.06
    DCL[33]        ResNet50   94.5              96.7         94.7       3.46          24.91      33.06
    Cross-X[34]    ResNet50   94.6              97.0         94.6       3.88          25.56      38.86
    CAL[17]        ResNet50   95.5              98.0         96.4       3.72          33.73      33.08
    WS-DAN[18]     ResNet50   94.5              97.1         95.6       4.02          33.24      33.08
    PMG[26]        ResNet50   95.1              97.8         95.7       2.94          45.12      69.82
    CN-CNN[35]     ResNet50   94.9              97.6         94.9       1.92          42.31      47.65
    Ours           ResNet50   95.7              98.8         97.4       2.97          40.64      69.61
  • References:

    [1] Bay H, Tuytelaars T, Van Gool L. SURF: speeded up robust features[C]//Proceedings of the 9th European Conference on Computer Vision, 2006: 404–417. https://doi.org/10.1007/11744023_32.
    [2] Csurka G, Dance C R, Fan L X, et al. Visual categorization with bags of keypoints[C]//Workshop on Statistical Learning in Computer Vision, Prague, 2004.
    [3] De Sousa Matos F M, De Souza R M C R. An image vehicle classification method based on edge and PCA applied to blocks[C]//International Conference on Systems, Man, and Cybernetics, 2012: 1688–1693. https://doi.org/10.1109/ICSMC.2012.6377980.
    [4] Behley J, Steinhage V, Cremers A B. Laser-based segment classification using a mixture of bag-of-words[C]//2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013: 4195–4200. https://doi.org/10.1109/IROS.2013.6696957.
    [5] Liao L, Hu R M, Xiao J, et al. Exploiting effects of parts in fine-grained categorization of vehicles[C]//Proceedings of the 2015 IEEE International Conference on Image Processing, 2015: 745–749. https://doi.org/10.1109/ICIP.2015.7350898.
    [6] Hsieh J W, Chen L C, Chen D Y. Symmetrical SURF and its applications to vehicle detection and vehicle make and model recognition[J]. IEEE Trans Intell Transp Syst, 2014, 15(1): 6−20. doi: 10.1109/TITS.2013.2294646
    [7] Feng J Z, Ma X C. Fine-grained entity type classification based on transfer learning[J]. Acta Autom Sin, 2020, 46(8): 1759−1766. doi: 10.16383/j.aas.c190041
    [8] Luo J H, Wu J X. A survey on fine-grained image categorization using deep convolutional features[J]. Acta Autom Sin, 2017, 43(8): 1306−1318. doi: 10.16383/j.aas.2017.c160425
    [9] Wang R G, Yao X C, Yang J, et al. Deep transfer learning for fine-grained categorization on micro datasets[J]. Opto-Electron Eng, 2019, 46(6): 180416. doi: 10.12086/oee.2019.180416
    [10] Wei X S, Song Y Z, Aodha O M, et al. Fine-grained image analysis with deep learning: a survey[J]. IEEE Trans Pattern Anal Mach Intell, 2022, 44(12): 8927−8948. doi: 10.1109/TPAMI.2021.3126648
    [11] Yang Z, Luo T G, Wang D, et al. Learning to navigate for fine-grained classification[C]//Proceedings of the 15th European Conference on Computer Vision, 2018: 438–454. https://doi.org/10.1007/978-3-030-01264-9_26.
    [12] Fang J, Zhou Y, Yu Y, et al. Fine-grained vehicle model recognition using a coarse-to-fine convolutional neural network architecture[J]. IEEE Trans Intell Transp Syst, 2017, 18(7): 1782−1792. doi: 10.1109/TITS.2016.2620495
    [13] Zhang X P, Xiong H K, Zhou W G, et al. Fused one-vs-all features with semantic alignments for fine-grained visual categorization[J]. IEEE Trans Image Process, 2016, 25(2): 878−892. doi: 10.1109/TIP.2015.2509425
    [14] Xu H P, Qi G L, Li J J, et al. Fine-grained image classification by visual-semantic embedding[C]//Proceedings of the 27th International Joint Conference on Artificial Intelligence, 2018: 1043–1049. https://doi.org/10.5555/3304415.3304563.
    [15] Zhang H, Xu T, Elhoseiny M, et al. SPDA-CNN: unifying semantic part detection and abstraction for fine-grained recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016: 1143–1152. https://doi.org/10.1109/CVPR.2016.129.
    [16] Ding Y F, Ma Z Y, Wen S G, et al. AP-CNN: weakly supervised attention pyramid convolutional neural network for fine-grained visual classification[J]. IEEE Trans Image Process, 2021, 30: 2826−2836. doi: 10.1109/TIP.2021.3055617
    [17] Rao Y M, Chen G Y, Lu J W, et al. Counterfactual attention learning for fine-grained visual categorization and re-identification[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 1005–1014. https://doi.org/10.1109/ICCV48922.2021.00106.
    [18] Hu T, Qi H G, Huang Q M, et al. See better before looking closer: weakly supervised data augmentation network for fine-grained visual classification[Z]. arXiv: 1901.09891, 2019. https://doi.org/10.48550/arXiv.1901.09891.
    [19] Lin T Y, RoyChowdhury A, Maji S. Bilinear CNN models for fine-grained visual recognition[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015: 1449–1457. https://doi.org/10.1109/ICCV.2015.170.
    [20] Yu C J, Zhao X Y, Zheng Q, et al. Hierarchical bilinear pooling for fine-grained visual recognition[C]//Proceedings of the 15th European Conference on Computer Vision, 2018: 595–610. https://doi.org/10.1007/978-3-030-01270-0_35.
    [21] Gao Y, Beijbom O, Zhang N, et al. Compact bilinear pooling[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016: 317–326. https://doi.org/10.1109/CVPR.2016.41.
    [22] Kong S, Fowlkes C. Low-rank bilinear pooling for fine-grained classification[C]//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017: 7025–7034. https://doi.org/10.1109/CVPR.2017.743.
    [23] Sun M, Yuan Y C, Zhou F, et al. Multi-attention multi-class constraint for fine-grained image recognition[C]//15th European Conference on Computer Vision, 2018: 834–850. https://doi.org/10.1007/978-3-030-01270-0_49.
    [24] Zheng X W, Ji R R, Sun X S, et al. Towards optimal fine grained retrieval via decorrelated centralized loss with normalize-scale layer[C]//Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019: 1140. https://doi.org/10.1609/aaai.v33i01.33019291.
    [25] He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 770–778. https://doi.org/10.1109/CVPR.2016.90.
    [26] Du R Y, Cheng D L, Bhunia A K, et al. Fine-grained visual classification via progressive multi-granularity training of jigsaw patches[C]//16th European Conference on Computer Vision, 2020: 153–168. https://doi.org/10.1007/978-3-030-58565-5_10.
    [27] Choe J, Shim H. Attention-based dropout layer for weakly supervised object localization[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 2214–2223. https://doi.org/10.1109/CVPR.2019.00232.
    [28] Krause J, Stark M, Deng J, et al. 3D object representations for fine-grained categorization[C]//2013 IEEE International Conference on Computer Vision Workshops, 2013: 554–561. https://doi.org/10.1109/ICCVW.2013.77.
    [29] Yang L J, Luo P, Loy C C, et al. A large-scale car dataset for fine-grained categorization and verification[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015: 3973–3981. https://doi.org/10.1109/CVPR.2015.7299023.
    [30] Ali M, Tahir M A, Durrani M N. Vehicle images dataset for make and model recognition[J]. Data Brief, 2022, 42: 108107. doi: 10.1016/j.dib.2022.108107
    [31] Song J W, Yang R Y. Feature boosting, suppression, and diversification for fine-grained visual classification[C]//International Joint Conference on Neural Networks, 2021: 1–8. https://doi.org/10.1109/IJCNN52387.2021.9534004.
    [32] Zhou M H, Bai Y L, Zhang W, et al. Look-into-object: self-supervised structure modeling for object recognition[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 11771–11780. https://doi.org/10.1109/CVPR42600.2020.01179.
    [33] Chen Y, Bai Y L, Zhang W, et al. Destruction and construction learning for fine-grained image recognition[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 5152–5161. https://doi.org/10.1109/CVPR.2019.00530.
    [34] Luo W, Yang X T, Mo X J, et al. Cross-X learning for fine-grained visual categorization[C]//IEEE/CVF International Conference on Computer Vision, 2019: 8241–8250. https://doi.org/10.1109/ICCV.2019.00833.
    [35] Guo C Y, Xie J Y, Liang K M, et al. Cross-layer navigation convolutional neural network for fine-grained visual classification[C]//ACM Multimedia Asia, 2021: 49. https://doi.org/10.1145/3469877.3490579.

Publication history
Received: 2023-03-05
Revised: 2023-05-23
Accepted: 2023-06-05
Published: 2023-08-20