Camera-aware unsupervised person re-identification method guided by pseudo-label refinement
-
Abstract
Unsupervised person re-identification has attracted increasing attention due to its broad practical applications. Most clustering-based contrastive learning methods treat each cluster as a pseudo-identity class, overlooking the intra-class variance caused by differences in camera style. Some methods introduce camera-aware contrastive learning, partitioning each cluster into multiple sub-clusters according to camera view, but they are susceptible to misguidance from noisy pseudo-labels. To address this issue, we first refine the pseudo-labels based on the similarity between instances in the feature space, using a weighted combination of the predicted labels of each instance's nearest neighbors and the original clustering result. The refined pseudo-labels are then used to dynamically associate each instance with the class centers it may belong to, while eliminating potential false negative samples. This improves the selection mechanism for positive and negative samples in camera-aware contrastive learning and effectively mitigates the misleading effect of noisy pseudo-labels on the contrastive learning task. On the Market-1501, MSMT17, and PersonX datasets, our method achieves mAP/Rank-1 of 85.2%/94.4%, 44.3%/74.1%, and 88.7%/95.9%, respectively.
-
Overview
Unsupervised person re-identification has received increasing attention due to its wide range of practical applications. Most clustering-based contrastive learning methods treat each cluster as a pseudo-identity class and focus on enlarging inter-class differences, while ignoring the intra-class variation caused by differences in viewpoint, lighting, and background across cameras. This makes it difficult for clustering algorithms to group samples of the same identity into the same cluster, inevitably producing noisy pseudo-labels. Some methods introduce camera-aware contrastive learning, which divides a single cluster into multiple sub-clusters according to camera view and computes intra-camera and inter-camera contrastive losses separately. However, noise in the pseudo-labels can corrupt the selection of positive and negative samples in camera-aware contrastive learning and thereby mislead the model's learning process.

To address this issue, this paper proposes a camera-aware unsupervised person re-identification method guided by refined pseudo-labels. The method first computes the similarity between training instances in the feature space and determines a neighborhood set for each instance. It then refines the one-hot pseudo-labels through a weighted combination of the predicted labels of the samples within the neighborhood and the original clustering result. The core idea is to encourage the model not only to pull samples toward their respective cluster centers, but also to build associations with nearby samples that may carry the same identity information. This strategy strengthens the model's robustness against noisy labels while reducing the risk of overfitting.

Building on this, the paper further proposes camera-aware contrastive learning guided by the refined pseudo-labels. Using the per-class probabilities in an instance's refined pseudo-label, the model dynamically associates the instance with its potential class centers, no longer relying on a single class center as the positive sample, and filters out potential false positive and false negative samples. This improves the selection mechanism for positive and negative samples in camera-aware contrastive learning and effectively mitigates the influence of noisy pseudo-labels on the contrastive learning task.

The proposed method was validated on three large-scale public datasets. The results show significant improvements over the baseline and advantages over current advanced methods in the same field: on the Market-1501, MSMT17, and PersonX datasets, the method achieves mAP/Rank-1 of 85.2%/94.4%, 44.3%/74.1%, and 88.7%/95.9%, respectively.
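To make the label-refinement step concrete, the sketch below shows one plausible implementation of the weighted aggregation described above, assuming PyTorch. The function name `refine_labels`, the neighborhood size `k`, the mixing weight `alpha`, and the use of cosine similarity over L2-normalized features are illustrative assumptions, not necessarily the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def refine_labels(features, cluster_labels, num_classes, k=20, alpha=0.5):
    """Refine one-hot pseudo-labels by mixing in the labels of each
    instance's k nearest neighbors in feature space (hypothetical sketch).

    features:       (N, D) L2-normalized instance features
    cluster_labels: (N,)   hard pseudo-labels from clustering (int64)
    alpha:          weight kept on the original clustering result
    """
    # One-hot encoding of the original clustering result
    one_hot = F.one_hot(cluster_labels, num_classes).float()   # (N, C)

    # Cosine similarity between all instances (features are normalized)
    sim = features @ features.t()                              # (N, N)
    sim.fill_diagonal_(-1)                                     # exclude self

    # Neighborhood set: the k most similar instances per sample
    _, knn = sim.topk(k, dim=1)                                # (N, k)

    # Consensus of the neighbors' predicted (one-hot) labels
    neighbor_labels = one_hot[knn].mean(dim=1)                 # (N, C)

    # Weighted combination of clustering result and neighbor consensus
    return alpha * one_hot + (1 - alpha) * neighbor_labels
```

A larger `alpha` keeps the refined label close to the clustering result, while a smaller one lets the neighborhood consensus dominate; either way the result is a soft distribution, the kind of target a soft cross-entropy term such as $\widetilde{L}_{\text{ce}}$ in Table 2 would consume.

In the same spirit, the following hedged sketch shows how the refined labels could drive positive/negative selection in the inter-camera branch of the camera-aware contrastive loss: every camera-aware proxy whose class receives non-negligible probability under the instance's refined label is treated as a potential positive, and proxies of those candidate classes are excluded from the negatives as possible false negatives. The proxy memory layout, the threshold `tau`, and the temperature are assumptions made for illustration.

```python
import torch

def inter_camera_loss(feat, refined_label, proxies, proxy_class, proxy_cam,
                      cam_id, tau=0.1, temp=0.07):
    """Contrastive loss over camera-aware proxies for one query (sketch).

    feat:          (D,)   L2-normalized query feature
    refined_label: (C,)   refined label distribution of the query
    proxies:       (P, D) memory bank of per-camera cluster proxies
    proxy_class:   (P,)   cluster id of each proxy
    proxy_cam:     (P,)   camera id of each proxy
    cam_id:        camera id of the query image
    """
    sim = proxies @ feat / temp                   # (P,) scaled similarities

    # Classes the query may belong to, according to its refined label
    candidate = refined_label[proxy_class] > tau  # (P,) boolean mask

    # Positives: candidate-class proxies from other cameras; assumes at
    # least one exists (the query's own cluster normally provides one)
    pos = candidate & (proxy_cam != cam_id)

    # Negatives: proxies of non-candidate classes only, filtering out
    # likely false negatives (same identity seen by another camera)
    neg = ~candidate

    logits = torch.cat([sim[pos], sim[neg]])
    log_prob = logits - torch.logsumexp(logits, dim=0)
    n_pos = int(pos.sum())
    return -log_prob[:n_pos].mean()               # pull toward all positives
```

An intra-camera counterpart would apply the same candidate masking restricted to proxies from the query's own camera; combining the two branches with the refined cross-entropy term would correspond to the kind of loss combination ablated in Table 2.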
-
Table 1. Comparison between our method and the latest methods
| Type | Method | Venue | Market-1501 (mAP / Rank-1, %) | MSMT17 (mAP / Rank-1, %) | PersonX (mAP / Rank-1, %) |
|------|--------|-------|-------------------------------|--------------------------|---------------------------|
| UDA | ECN[2] | CVPR'19 | 43.0 / 75.1 | 10.2 / 30.2 | - |
| UDA | SPCL[3] | NeurIPS'20 | 77.5 / 89.7 | 26.8 / 53.7 | 78.5 / 91.1 |
| UDA | MEB-Net[13] | ECCV'20 | 76.0 / 89.9 | - | - |
| UDA | MMT[12] | ICLR'20 | 71.2 / 87.7 | 23.5 / 50.0 | 78.9 / 90.6 |
| UDA | GLT[31] | CVPR'21 | 79.5 / 92.2 | 27.7 / 59.5 | - |
| UDA | MCL[32] | JBUAA'22 | 80.6 / 93.2 | 28.5 / 58.5 | - |
| UDA | CACHE[33] | TCSVT'22 | 83.1 / 93.4 | 31.3 / 58.0 | - |
| UDA | CIFL[20] | TMM'22 | 83.3 / 93.9 | 39.0 / 70.5 | - |
| UDA | MCRN[16] | AAAI'22 | 83.8 / 93.8 | 35.7 / 67.5 | - |
| UDA | IICM[34] | JCRD'23 | 74.9 / 89.0 | 27.2 / 52.3 | - |
| UDA | NPSS[21] | TIFS'23 | 84.6 / 94.1 | 38.9 / 69.4 | - |
| UDA | CaCL[11] | ICCV'23 | 84.7 / 93.8 | 40.3 / 70.0 | - |
| USL | SPCL[3] | NeurIPS'20 | 73.1 / 88.1 | 19.1 / 42.3 | 72.3 / 88.1 |
| USL | MetaCam[7] | CVPR'21 | 61.7 / 83.9 | 15.5 / 35.2 | - |
| USL | IICS[8] | CVPR'21 | 72.9 / 89.5 | 26.9 / 56.4 | - |
| USL | RLCC[14] | CVPR'21 | 77.7 / 90.8 | 27.9 / 56.5 | - |
| USL | CAP[9] | AAAI'21 | 79.2 / 91.4 | 36.9 / 67.4 | - |
| USL | ICE[17] | ICCV'21 | 82.3 / 93.8 | 38.9 / 70.2 | - |
| USL | CACHE[33] | TCSVT'22 | 81.0 / 92.0 | 31.8 / 58.2 | - |
| USL | CIFL[20] | TMM'22 | 82.4 / 93.9 | 38.8 / 70.1 | - |
| USL | GRACL[35] | TCSVT'22 | 83.7 / 93.2 | 34.6 / 64.0 | 87.9 / 95.3 |
| USL | PPLR[15] | CVPR'22 | 84.4 / 94.3 | 42.2 / 73.3 | - |
| USL | CA-UReID[10] | ICME'22 | 84.5 / 94.1 | - | - |
| USL | NPSS[21] | TIFS'23 | 82.3 / 94.0 | 36.7 / 68.8 | - |
| USL | LRMGFS[36] | JEMI'23 | 83.3 / 93.3 | 27.4 / 58.4 | - |
| USL | PLRIS[22] | ICIP'23 | 83.2 / 93.1 | 43.3 / 71.5 | - |
| USL | AdaMG[37] | TCSVT'23 | 84.6 / 93.9 | 38.0 / 66.3 | 87.6 / 95.0 |
| USL | LP[18] | TIP'23 | 85.8 / 94.5 | 39.5 / 67.9 | - |
| USL | DCCT[19] | TCSVT'23 | 86.3 / 94.4 | 41.8 / 68.7 | 87.6 / 95.0 |
| USL | CC[4] | CoRR'21 | 82.1 / 92.3 | 27.6 / 56.0 | 84.7 / 94.4 |
| USL | Ours | - | 85.2 / 94.4 | 44.3 / 74.1 | 88.7 / 95.9 |
Table 2. Results of ablation studies on Market-1501 and PersonX
| Model | $\widetilde{L}_{\text{ce}}$ | $\widetilde{L}_{\text{intra}}$ | $\widetilde{L}_{\text{inter}}$ | Market-1501 (mAP / Rank-1 / Rank-5 / Rank-10, %) | PersonX (mAP / Rank-1 / Rank-5 / Rank-10, %) |
|-------|------|------|------|---------------------------|---------------------------|
| M1 (CC[4]) | - | - | - | 82.1 / 92.3 / 96.7 / 97.9 | 84.7 / 94.4 / 98.3 / 99.3 |
| M2 | - | √ | - | 82.9 / 93.2 / 97.3 / 98.2 | 87.9 / 95.7 / 98.8 / 99.5 |
| M3 | - | - | √ | 83.9 / 93.6 / 97.5 / 98.3 | 87.3 / 95.9 / 98.8 / 99.4 |
| M4 | - | √ | √ | 84.1 / 93.6 / 97.6 / 98.5 | 88.5 / 95.8 / 98.9 / 99.6 |
| M5 | √ | - | - | 84.7 / 94.2 / 97.9 / 98.7 | 87.5 / 95.5 / 98.9 / 99.5 |
| M6 (Ours) | √ | √ | √ | 85.2 / 94.4 / 98.1 / 98.6 | 88.7 / 95.9 / 99.2 / 99.7 |
-
References
[1]
Zhang X Y, Zhang B H, Lv X Q, et al. The joint discriminative and generative learning for person re-identification of deep dual attention[J]. Opto-Electron Eng, 2021, 48(5): 200388. doi: 10.12086/oee.2021.200388
[2] Zhong Z, Zheng L, Luo Z M, et al. Invariance matters: exemplar memory for domain adaptive person re-identification[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 598–607. https://doi.org/10.1109/CVPR.2019.00069.
[3] Ge Y X, Zhu F, Chen D P, et al. Self-paced contrastive learning with hybrid memory for domain adaptive object re-ID[C]//Proceedings of the 34th International Conference on Neural Information Processing Systems, 2020: 949. https://doi.org/10.5555/3495724.3496673.
[4] Dai Z Z, Wang G Y, Yuan W H, et al. Cluster contrast for unsupervised person re-identification[C]//Proceedings of the 16th Asian Conference on Computer Vision, 2023: 319–337. https://doi.org/10.1007/978-3-031-26351-4_20.
[5] Tian J J, Tang Q H, Li R, et al. A camera identity-guided distribution consistency method for unsupervised multi-target domain person re-identification[J]. ACM Trans Intell Syst Technol, 2021, 12(4): 38. doi: 10.1145/3454130
[6] Choi Y, Choi M, Kim M, et al. StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 8789–8797. https://doi.org/10.1109/CVPR.2018.00916.
[7] Yang F X, Zhong Z, Luo Z M, et al. Joint noise-tolerant learning and meta camera shift adaptation for unsupervised person re-identification[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 4853–4862. https://doi.org/10.1109/CVPR46437.2021.00482.
[8] Xuan S Y, Zhang S L. Intra-inter camera similarity for unsupervised person re-identification[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 11921–11930. https://doi.org/10.1109/CVPR46437.2021.01175.
[9] Wang M L, Lai B S, Huang J Q, et al. Camera-aware proxies for unsupervised person re-identification[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2021: 2764–2772. https://doi.org/10.1609/aaai.v35i4.16381.
[10] Li X, Liang T F, Jin Y, et al. Camera-aware style separation and contrastive learning for unsupervised person re-identification[C]//2022 IEEE International Conference on Multimedia and Expo, 2022: 1–6. https://doi.org/10.1109/ICME52920.2022.9859842.
[11] Lee G, Lee S, Kim D, et al. Camera-driven representation learning for unsupervised domain adaptive person re-identification[Z]. arXiv: 2308.11901, 2023. https://doi.org/10.48550/arXiv.2308.11901.
[12] Ge Y X, Chen D P, Li H S. Mutual mean-teaching: pseudo label refinery for unsupervised domain adaptation on person re-identification[C]//8th International Conference on Learning Representations, 2020.
[13] Zhai Y P, Ye Q X, Lu S J, et al. Multiple expert brainstorming for domain adaptive person re-identification[C]//16th European Conference on Computer Vision, 2020: 594–611. https://doi.org/10.1007/978-3-030-58571-6_35.
[14] Zhang X, Ge Y X, Qiao Y, et al. Refining pseudo labels with clustering consensus over generations for unsupervised object re-identification[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 3435–3444. https://doi.org/10.1109/CVPR46437.2021.00344.
[15] Cho Y, Kim W J, Hong S, et al. Part-based pseudo label refinement for unsupervised person re-identification[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2022: 7298–7308. https://doi.org/10.1109/CVPR52688.2022.00716.
[16] Wu Y H, Huang T T, Yao H T, et al. Multi-centroid representation network for domain adaptive person re-ID[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2022: 2750–2758. https://doi.org/10.1609/aaai.v36i3.20178.
[17] Chen H, Lagadec B, Bremond F. ICE: inter-instance contrastive encoding for unsupervised person re-identification[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 14940–14949. https://doi.org/10.1109/ICCV48922.2021.01469.
[18] Lan L, Teng X, Zhang J, et al. Learning to purification for unsupervised person re-identification[J]. IEEE Trans Image Process, 2023, 33: 3338−3353. doi: 10.1109/TIP.2023.3278860
[19] Chen Z Q, Cui Z C, Zhang C, et al. Dual clustering co-teaching with consistent sample mining for unsupervised person re-identification[J]. IEEE Trans Circuits Syst Video Technol, 2023, 33(10): 5908−5920. doi: 10.1109/TCSVT.2023.3261898
[20] Pang Z Q, Zhao L L, Liu Q Y, et al. Camera invariant feature learning for unsupervised person re-identification[J]. IEEE Trans Multimed, 2023, 25: 6171−6182. doi: 10.1109/TMM.2022.3206662
[21] Wang H J, Yang M, Liu J L, et al. Pseudo-label noise prevention, suppression and softening for unsupervised person re-identification[J]. IEEE Trans Inf Forensics Secur, 2023, 18: 3222−3237. doi: 10.1109/TIFS.2023.3277694
[22] Li P N, Wu K Y, Zhou S P, et al. Pseudo labels refinement with intra-camera similarity for unsupervised person re-identification[C]//2023 IEEE International Conference on Image Processing, 2023: 366–370. https://doi.org/10.1109/ICIP49359.2023.10222317.
[23] Zheng L, Shen L Y, Tian L, et al. Scalable person re-identification: a benchmark[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015: 1116–1124. https://doi.org/10.1109/ICCV.2015.133.
[24] Wei L H, Zhang S L, Gao W, et al. Person transfer GAN to bridge domain gap for person re-identification[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 79–88. https://doi.org/10.1109/CVPR.2018.00016.
[25] Sun X X, Zheng L. Dissecting person re-identification from the viewpoint of viewpoint[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 608–617. https://doi.org/10.1109/CVPR.2019.00070.
[26] Ester M, Kriegel H P, Sander J, et al. A density-based algorithm for discovering clusters in large spatial databases with noise[C]//Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, 1996: 226–231. https://doi.org/10.5555/3001460.3001507.
[27] Zhou D Y, Bousquet O, Lal T N, et al. Learning with local and global consistency[C]//Proceedings of the 16th International Conference on Neural Information Processing Systems, 2003: 321–328. https://doi.org/10.5555/2981345.2981386.
[28] Deng J, Dong W, Socher R, et al. ImageNet: a large-scale hierarchical image database[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2009: 248–255. https://doi.org/10.1109/CVPR.2009.5206848.
[29] He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 770–778. https://doi.org/10.1109/CVPR.2016.90.
[30] Zhong Z, Zheng L, Cao D L, et al. Re-ranking person re-identification with k-reciprocal encoding[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 3652–3661. https://doi.org/10.1109/CVPR.2017.389.
[31] Zheng K C, Liu W, He L X, et al. Group-aware label transfer for domain adaptive person re-identification[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 5306–5315. https://doi.org/10.1109/CVPR46437.2021.00527.
[32]
Li H, Zhang X W, Zhao X P, et al. Multi-label cooperative learning for cross domain person re-identification[J]. J Beijing Univ Aeronaut Astronaut, 2022, 48(8): 1534−1542. doi: 10.13700/j.bh.1001-5965.2021.0600
[33] Liu Y X, Ge H W, Sun L, et al. Complementary attention-driven contrastive learning with hard-sample exploring for unsupervised domain adaptive person re-ID[J]. IEEE Trans Circuits Syst Video Technol, 2023, 33(1): 326−341. doi: 10.1109/TCSVT.2022.3200671
[34]
Chen L W, Ye F, Huang T Q, et al. An unsupervised person re-identification method based on intra-/inter-camera merger[J]. J Comput Res Dev, 2023, 60(2): 415−425. doi: 10.7544/issn1000-1239.202110732
[35] Zhang H W, Zhang G Q, Chen Y H, et al. Global relation-aware contrast learning for unsupervised person re-identification[J]. IEEE Trans Circuits Syst Video Technol, 2022, 32(12): 8599−8610. doi: 10.1109/TCSVT.2022.3194084
[36]
Qian Y P, Wang F S, Xiong L. Unsupervised person re-identification method based on local refinement multi-branch and global feature sharing[J]. J Electron Meas Instrum, 2023, 37(1): 106−115. doi: 10.13382/j.jemi.B2205837
[37] Peng J J, Jiang G Q, Wang H B. Adaptive memorization with group labels for unsupervised person re-identification[J]. IEEE Trans Circuits Syst Video Technol, 2023, 33(10): 5802−5813. doi: 10.1109/TCSVT.2023.3258917
[38] Wang F, Liu H P. Understanding the behaviour of contrastive loss[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 2495–2504. https://doi.org/10.1109/CVPR46437.2021.00252.