Cross-modality Hierarchical Clustering and Refinement for Unsupervised Visible-Infrared Person Re-Identification
Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality image retrieval task. Compared to visible modality person re-identification that handles only the intra-modality discrepancy, VI-ReID suffers from an additional modality gap. Most existing VI-ReID methods achieve promising accuracy in a supervised setting, but the high annotation cost limits their scalability to real-world scenarios. Although a few unsupervised VI-ReID methods already exist, they typically rely on intra-modality initialization and cross-modality instance selection, despite the additional computational time required for intra-modality initialization. In this paper, we study the fully unsupervised VI-ReID problem and propose a novel cross-modality hierarchical clustering and refinement (CHCR) method by promoting modality-invariant feature learning and improving the reliability of pseudo-labels. Unlike conventional VI-ReID methods, CHCR does not rely on any manual identity annotation and intra-modality initialization. First, we design a simple and effective cross-modality clustering baseline that clusters between modalities. Then, to provide sufficient inter-modality positive sample pairs for modality-invariant feature learning, we propose a cross-modality hierarchical clustering algorithm to promote the clustering of inter-modality positive samples into the same cluster. In addition, we develop an inter-channel pseudo-label refinement algorithm to eliminate unreliable pseudo-labels by checking the clustering results of three channels in the visible modality. Extensive experiments demonstrate that CHCR outperforms state-of-the-art unsupervised methods and achieves performance competitive with many supervised methods.
Funding
National Natural Science Foundation of China (NSFC, Grant no. 62231013, 62272136, 62171164, 61872114 and 62131004)
History
Email Address of Submitting Author
zqpang98@gmail.comSubmitting Author's Institution
Harbin Institute of TechnologySubmitting Author's Country
- China