深度双重注意力的生成与判别联合学习的行人重识别

张晓艳; 张宝华; 吕晓琪; 谷宇; 王月明; 刘新; 任彦; 李建军

doi:10.12086/oee.2021.200388

深度双重注意力的生成与判别联合学习的行人重识别

- 1.
  内蒙古科技大学信息工程学院，内蒙古自治区包头 014010
- 2.
  内蒙古工业大学信息工程学院，内蒙古自治区呼和浩特 010051
- 3.
  内蒙古自治区模式识别与智能图像处理重点实验室，内蒙古自治区包头 014010
基金项目:
国家自然科学基金资助项目(61962046，62001255，61841204)；内蒙古杰青培育项目(2018JQ02)；内蒙古科技计划项目"面向交通大数据智能分析平台的关键技术研究与实现"(202001)；内蒙古草原英才，内蒙古自治区自然科学基金(2019MS06003，2018MS06018)；教育部"春晖计划"合作科研项目(教外司留1383号)；内蒙古自治区高等学校科学技术研究项目(NJZY145)资助

详细信息

作者简介:
张晓艳(1995-)，女，硕士研究生，主要从事行人重识别的研究。E-mail：1297118658@qq.com

**^*通讯作者:** 张宝华(1981-)，男，博士，教授，硕士生导师，主要从事数字图像处理及应用、目标识别与跟踪的研究. E-mail: zbh_wj2004@imust.cn

中图分类号: TP391

收稿日期: 2020-10-20

修回日期: 2021-03-31

刊出日期: 2021-05-15

The joint discriminative and generative learning for person re-identification of deep dual attention

- 1.
  School of Information Engineering, Inner Mongolia University of Science and Technology, Baotou, Inner Mongolia 014010, China
- 2.
  School of Information Engineering, Mongolia Industrial University, Huhehaote, Inner Mongolia 010051, China
- 3.
  Inner Mongolia Key Laboratory of Pattern Recognition and Intelligent Image Processing, Baotou, Inner Mongolia 014010, China
Fund Project: National Natural Science Foundation (61962046, 61663036, 61841204), Inner Mongolia Outstanding Youth Cultivation Project (2018JQ02), Inner Mongolia Science and Technology Planning Project "Research and Realization of Key Technologies for the Intelligent Analysis Platform of Traffic Big Data"(202001), Inner Mongolia Prairie Talent, Inner Mongolia Youth Science and Technology Innovation Talent Project, Inner Mongolia Autonomous Region Natural Science Fund (2015MS0604, 2018MS06018), and Inner Mongolia Autonomous Region Higher Education Science Funded by the Technical Research Project (NJZY145)

More Information

**^*Corresponding author:** Zhang Baohua. E-mail: zbh_wj2004@imust.cn

Received Date 20 October 2020

Revised Date 31 March 2021

Published Date 15 May 2021

摘要

摘要

在行人重识别任务中存在数据集标注难度大，样本量少，特征提取后细节特征缺失等问题。针对以上问题提出深度双重注意力的生成与判别联合学习的行人重识别。首先，构建联合学习框架，将判别模块嵌入生成模块，实现图像生成和判别端到端的训练，及时将生成图像反馈给判别模块，同时优化生成模块与判别模块。其次，通过相邻的通道注意力模块间连接和相邻空间注意力模块间连接，融合所有通道特征和空间特征，构建深度双重注意力模块，将其嵌入教师模型，使模型能更好地提取行人细节身份特征，提高模型识别能力。实验结果表明，该算法在Market-1501和DukeMTMC-ReID数据集上具有较好的鲁棒性、判别性。
- 行人重识别 /
- 图像生成 /
- 联合学习 /
- 注意力机制 /
- 深度学习
Abstract

In the task of person re-identification, there are problems such as difficulty in labeling datasets, small sample size, and detail feature missing after feature extraction. The joint discriminative and generative learning for person re-identification of the deep dual attention is proposed against the above issues. Firstly, the author constructs a joint learning framework and embeds the discriminative module into the generative module to realize the end-to-end training of image generative and discriminative. Then, the generated pictures are sent to the discriminative module to optimize the generative module and the discriminative module simultaneously. Secondly, according to the connection between the channels of the attention modules and the connection between the attention modules in spaces, it merges all the channel features and spatial features and constructs a deep dual attention module. By embedding the models in the teacher model, the model can better extract the fine-grained features of the objects and improve the recognition ability. The experimental results show that the algorithm has better robustness and discriminative capability on the Market-1501 and the DukeMTMC-ReID datasets.
- person re-identification /
- image generative /
- joint learning /
- attention /
- deep learning

Overview

Overview

Overview: Person re-identification is a technology that uses computer technology to determine whether there is a specific object in an image or video sequence. In the task of person re-identification, there are problems such as difficulty in labeling datasets, small sample size, and detail feature missing after feature extraction.The joint discriminative and generative learning for person re-identification of deep dual attention is proposed against the above issues. Firstly, the author constructs a joint learning framework and embeds the discriminative module into the generative module to realize the end-to-end training of images generative and discriminative. Then the generated pictures are sent to the discriminative module to optimize the generative module and discriminative module simultaneously. Secondly, we construct a deep dual attention module. Through the connection between the channels of the attention modules and the connection between attention modules in spaces, the information collected by the previous attention module is passed to the next attention module. The attention feature is fused with the currently extracted attention feature to ensure that the information between attention modules of the same dimension can flow in a feed-forward manner, effectively avoiding the frequent changes of the information between attention modules. At last, the author merges all channel features and spatial features. Due to the high similarity of the appearance of the images in the dataset, the teacher model recognition becomes more difficult. Therefore, the deep dual attention module is embedded in the teacher model to improve the recognition ability of the teacher model. The teacher model is used to assist the student model to learn the main features. The generated images are used as training samples to force students to learn fine-grained features independent of appearance features. The author merges the main features and fine-grained features to re-identification person. The experimental results show that Rank-1 and mAP used in this paper are superior to the advanced correlation algorithms.

HTML全文

图 1 师生联合网络框架

Figure 1. Framework for teacher- student network

下载: 全尺寸图片幻灯片

图 2 学习率

Figure 2. Learning rate

下载: 全尺寸图片幻灯片

图 3 不同方法的耗时对比

Figure 3. Time-consuming comparison of different methods

下载: 全尺寸图片幻灯片

图 4 注意力机制不同阶段可视化对比结果

Figure 4. Visual contrast results of different stages of attention mechanism.

下载: 全尺寸图片幻灯片

表 1 消融实验

Table 1. Ablation study

Methods	Market1501		DukeMTMC-ReID
Methods	Rank-1	mAP	Rank-1	mAP
Baseline	86.66	65.14	81.32	64.08
Baseline+DA	87.73	69.57	81.23	64.45
Baseline+DDA	90.74	75.05	83.39	65.55
Prime+DDA	94.15	85.44	85.91	74.52
Ours	94.74	86.39	86.49	75.01

下载: 导出CSV

表 2 与主流行人重识别方法精度对比

Table 2. Accuracy comparison with the main popular re-identification method

Methods	Market1501		DukeMTMC-ReID
Methods	Rank-1	mAP	Rank-1	mAP
AACN^[10]	85.9	66.87	76.84	58.25
HA-CNN^[11]	91.2	75.7	80.5	63.8
PBAN^[17]	93.6	81.7	84.7	69.4
GLAD^[18]	89.9	73.9	-	-
PGFA^[19]	91.2	76.8	82.6	65.5
CBN^[20]	91.3	77.3	82.5	67.3
Deep-Person^[21]	92.31	79.58	80.9	64.8
VPM^[22]	93	80.8	83.6	72.6
PN-GAN^[23]	89.4	72.6	73.6	53.2
FD-GAN^[7]	90.5	77.7	80	64.5
Ours	94.74	86.39	86.49	75.01

下载: 导出CSV

表 3 算法测试时间对比结果

Table 3. Comparative results of test time of different methods

Methods	Time/s
Methods	Test time	Per match(19732)
GLAD^[18]	368	0.0186
PGFA^[19]	315	0.0159
CBN^[20]	347	0.0175
Ours	321	0.0162

下载: 导出CSV

参考文献(23)

参考文献

[1]	宋婉茹, 赵晴晴, 陈昌红, 等. 行人重识别研究综述[J]. 智能系统学报, 2017, 12(6): 770-780. https://www.cnki.com.cn/Article/CJFDTOTAL-ZNXT201706002.htm Song W R, Zhao Q Q, Chen C H, et al. Survey on pedestrian re-identification research[J]. CAAI Trans Intell Syst, 2017, 12(6): 770-780. https://www.cnki.com.cn/Article/CJFDTOTAL-ZNXT201706002.htm
[2]	冯霞, 杜佳浩, 段仪浓, 等. 基于深度学习的行人重识别研究综述[J]. 计算机应用研究, 2020, 37(11): 3220-3226, 3240. https://www.cnki.com.cn/Article/CJFDTOTAL-JSYJ202011004.htm Feng X, Du J H, Duan Y N, et al. Research on person re-identification based on deep learning[J]. Appl Res Comput, 2020, 37(11): 3220-3226, 3240. https://www.cnki.com.cn/Article/CJFDTOTAL-JSYJ202011004.htm
[3]	Matsukawa T, Okabe T, Suzuki E, et al. Hierarchical Gaussian descriptor for person re-identification[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 2016: 1363-1372.
[4]	Lisanti G, Masi I, Bagdanov A D, et al. Person re-identification by iterative re-weighted sparse ranking[J]. IEEE Trans Pattern Anal Mach Intell, 2015, 37(8): 1629-1642. doi: 10.1109/TPAMI.2014.2369055
[5]	Zheng Z D, Zheng L, Yang Y. Unlabeled samples generated by GAN improve the person re-identification baseline in vitro[C]//Proceedings of 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 3754-3762.
[6]	Wei L H, Zhang S L, Gao W, et al. Person transfer GAN to bridge domain gap for person re-identification[C]//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018: 79-88.
[7]	Ge Y X, Li Z W, Zhao H Y, et al. FD-GAN: pose-guided feature distilling GAN for robust person re-identification[C]//Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, 2018: 1222-1233.
[8]	Deng W J, Zheng L, Ye Q X, et al. Image-image domain adaptation with preserved self-similarity and domain-dissimilarity for person re-identification[C]//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018: 994-1003.
[9]	Song C F, Huang Y, Ouyang W L, et al. Mask-guided contrastive attention model for person re-identification[C]//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018: 1179-1188.
[10]	Xu J, Zhao R, Zhu F, et al. Attention-aware compositional network for person re-identification[C]//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018: 2119-2128.
[11]	Li W, Zhu X T, Gong S G. Harmonious attention network for person re-identification[C]//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018: 2285-2294.
[12]	Zheng Z D, Yang X D, Yu Z D, et al. Joint discriminative and generative learning for person re-identification[C]//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 2019: 2138-2147.
[13]	Woo S, Park J, Lee J Y, et al. CBAM: convolutional block attention module[C]//Proceedings of the 15th European Conference on Computer Vision, Munich, Germany, 2018: 3-19.
[14]	Ma X, Guo J D, Tang S H, et al. DCANet: learning connected attentions for convolutional neural networks[Z]. arXiv: 2007.05099, 2020.
[15]	He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 2016: 770-778.
[16]	郭玥秀, 杨伟, 刘琦, 等. 残差网络研究综述[J]. 计算机应用研究, 2020, 37(5): 1292-1297. https://www.cnki.com.cn/Article/CJFDTOTAL-JSYJ202005002.htm Guo Y X, Yang W, Liu Q, et al. Survey of residual network[J]. Appl Res Comput, 2020, 37(5): 1292-1297. https://www.cnki.com.cn/Article/CJFDTOTAL-JSYJ202005002.htm
[17]	Zhong W L, Jiang L F, Zhang T, et al. A part-based attention network for person re-identification[J]. Multimed Tools Appl, 2020, 79(31): 22525-22549. doi: 10.1007/s11042-019-08395-2
[18]	Wei L H, Zhang S L, Yao H T, et al. GLAD: global-local-alignment descriptor for scalable person re-identification[J]. IEEE Trans Multimed, 2019, 21(4): 986-999. doi: 10.1109/TMM.2018.2870522
[19]	Miao J X, Wu Y, Liu P, et al. Pose-guided feature alignment for occluded person re-identification[C]//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 2019: 542-551.
[20]	Zhuang Z J, Wei L H, Xie L X, et al. Rethinking the distribution gap of person re-identification with camera-based batch normalization[C]//Proceedings of the 16th European Conference on Computer Vision, Glasgow, UK, 2020: 140-157.
[21]	Bai X, Yang M K, Huang T T, et al. Deep-person: learning discriminative deep features for person Re-identification[J]. Pattern Recognit, 2020, 98: 107036. doi: 10.1016/j.patcog.2019.107036
[22]	Sun Y F, Xu Q, Li Y L, et al. Perceive where to focus: learning visibility-aware part-level features for partial person re-identification[C]//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 2019: 393-402.
[23]	Qian X L, Fu Y W, Xiang T, et al. Pose-normalized image generation for person re-identification[C]//Proceedings of the 15th European Conference on Computer Vision, Munich, Germany, 2018: 650-667.

施引文献

资源附件(0)

访问统计

访问统计

点击扫一扫

图(4)

表(3)

计量

文章访问数:
PDF下载数:
施引文献: 0

深度双重注意力的生成与判别联合学习的行人重识别

作者简介:
张晓艳(1995-)，女，硕士研究生，主要从事行人重识别的研究。E-mail：1297118658@qq.com

**^*通讯作者:** 张宝华(1981-)，男，博士，教授，硕士生导师，主要从事数字图像处理及应用、目标识别与跟踪的研究. E-mail: zbh_wj2004@imust.cn

The joint discriminative and generative learning for person re-identification of deep dual attention

**^*Corresponding author:** Zhang Baohua. E-mail: zbh_wj2004@imust.cn

摘要

Abstract

Overview

参考文献

访问统计

计量

目录

作者须知

其他内容

条款和政策

深度双重注意力的生成与判别联合学习的行人重识别

作者简介: 张晓艳(1995-)，女，硕士研究生，主要从事行人重识别的研究。E-mail：1297118658@qq.com

*通讯作者: 张宝华(1981-)，男，博士，教授，硕士生导师，主要从事数字图像处理及应用、目标识别与跟踪的研究. E-mail: zbh_wj2004@imust.cn

The joint discriminative and generative learning for person re-identification of deep dual attention

*Corresponding author: Zhang Baohua. E-mail: zbh_wj2004@imust.cn

摘要

Abstract

Overview

参考文献

访问统计

计量

出版历程

目录

作者须知

其他内容

条款和政策

作者简介:
张晓艳(1995-)，女，硕士研究生，主要从事行人重识别的研究。E-mail：1297118658@qq.com

**^*通讯作者:** 张宝华(1981-)，男，博士，教授，硕士生导师，主要从事数字图像处理及应用、目标识别与跟踪的研究. E-mail: zbh_wj2004@imust.cn

**^*Corresponding author:** Zhang Baohua. E-mail: zbh_wj2004@imust.cn