Zhiqi Pang and 4 more

Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality image retrieval task. Compared to visible-modality person re-identification, which handles only the intra-modality discrepancy, VI-ReID suffers from an additional modality gap. Most existing VI-ReID methods achieve promising accuracy in a supervised setting, but the high annotation cost limits their scalability to real-world scenarios. Although a few unsupervised VI-ReID methods already exist, they typically rely on intra-modality initialization and cross-modality instance selection, even though intra-modality initialization incurs additional computational time. In this paper, we study the fully unsupervised VI-ReID problem and propose a novel cross-modality hierarchical clustering and refinement (CHCR) method that promotes modality-invariant feature learning and improves the reliability of pseudo-labels. Unlike conventional VI-ReID methods, CHCR relies on neither manual identity annotation nor intra-modality initialization. First, we design a simple and effective cross-modality clustering baseline that clusters samples directly across the two modalities. Then, to provide sufficient inter-modality positive sample pairs for modality-invariant feature learning, we propose a cross-modality hierarchical clustering algorithm that encourages inter-modality positive samples to fall into the same cluster. In addition, we develop an inter-channel pseudo-label refinement algorithm that eliminates unreliable pseudo-labels by checking the consistency of the clustering results across the three color channels of the visible modality. Extensive experiments demonstrate that CHCR outperforms state-of-the-art unsupervised methods and achieves performance competitive with many supervised methods.
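As a rough illustration of the clustering step described above (this is a generic average-linkage agglomerative clustering sketch, not the authors' CHCR algorithm; the toy features, threshold, and function names are all hypothetical):

```python
import numpy as np

def hierarchical_cluster(features, merge_threshold):
    """Agglomerative clustering (average linkage) over features pooled
    from both modalities; merging continues while the closest pair of
    clusters is within merge_threshold."""
    feats = np.asarray(features, dtype=float)
    clusters = [[i] for i in range(len(feats))]
    while len(clusters) > 1:
        best, best_d = None, merge_threshold
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca = feats[clusters[a]].mean(axis=0)
                cb = feats[clusters[b]].mean(axis=0)
                d = np.linalg.norm(ca - cb)
                if d < best_d:
                    best_d, best = d, (a, b)
        if best is None:
            break
        a, b = best
        clusters[a] += clusters[b]
        del clusters[b]
    labels = np.empty(len(feats), dtype=int)
    for lbl, members in enumerate(clusters):
        labels[members] = lbl
    return labels

# Toy example: two identities, each with a "visible" and an "infrared"
# embedding that lie close together in feature space.
vis = [[0.0, 0.0], [5.0, 5.0]]
ir = [[0.1, -0.1], [5.1, 4.9]]
labels = hierarchical_cluster(vis + ir, merge_threshold=1.0)
```

With a suitable threshold, cross-modality embeddings of the same identity end up sharing a pseudo-label, which is the property the hierarchical clustering is designed to promote.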

Irving Barron and 1 more

We present optimized modulation and coding for the recently introduced dual modulated QR (DMQR) codes that extend traditional QR codes to carry additional secondary data in the orientation of elliptical dots that replace black modules in the barcode images. By dynamically adjusting the dot size, we realize gains in embedding strength for both the intensity modulation and the orientation modulation that carry the primary and secondary data, respectively. Furthermore, we develop a model for the coding channel for the secondary data that enables soft-decoding via 5G NR (new radio) codes already supported by mobile devices. The performance gains for the proposed optimized designs are characterized via theoretical analysis, simulations, and actual experiments using smartphone devices. The theoretical analysis and simulations inform our design choices for the modulation and coding, and the experiments characterize the overall improvement in performance for the optimized design over the prior unoptimized designs. Importantly, the optimized designs significantly increase the usability of DMQR codes with commonly used QR code beautification, which cannibalizes a portion of the barcode image area for the insertion of a logo or image. In experiments with a capture distance of 15 inches, the optimized designs increase the decoding success rates for the secondary data by between 10% and 32%, while also providing gains for primary data decoding at larger capture distances. When used with beautification in typical settings, the secondary message is decoded with a high success rate for the proposed optimized designs, whereas it invariably fails for the prior unoptimized designs.
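A minimal sketch of the orientation-modulation idea, to make the soft-decoding pipeline concrete. The angle constellation and the Gaussian angle-noise model here are illustrative assumptions, not the paper's actual design; a real decoder would feed the resulting log-likelihood ratios into a 5G NR channel decoder:

```python
def modulate_orientation(bit):
    # Hypothetical constellation: secondary bit 0 -> ellipse major axis
    # at 45 degrees, bit 1 -> 135 degrees.
    return 45.0 if bit == 0 else 135.0

def soft_demodulate(measured_angle, noise_std=10.0):
    # Log-likelihood ratio log p(y|bit=0) / p(y|bit=1) under an assumed
    # Gaussian angle-noise model; positive favors bit 0. This soft value
    # is what a 5G NR (LDPC/polar) decoder would consume.
    d0 = (measured_angle - 45.0) ** 2
    d1 = (measured_angle - 135.0) ** 2
    return (d1 - d0) / (2.0 * noise_std ** 2)
```

A measured angle near 45 degrees yields a large positive LLR (confident bit 0); an angle near 90 degrees yields an LLR near zero, signaling an unreliable dot that the channel code can correct.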

Li Ding and 5 more

We propose a novel hybrid framework for registering retinal images in the presence of extreme geometric distortions that are commonly encountered in ultra-widefield (UWF) fluorescein angiography. Our approach consists of two stages: a feature-based global registration and a vessel-based local refinement. For the global registration, we introduce a modified RANSAC algorithm that jointly identifies robust matches between feature keypoints in reference and target images and estimates a polynomial geometric transformation consistent with the identified correspondences. Our RANSAC modification particularly improves feature point matching and the registration in peripheral regions that are most severely impacted by the geometric distortions. The second local refinement stage is formulated in our framework as a parametric chamfer alignment for vessel maps obtained using a deep neural network. Because the complete vessel maps contribute to the chamfer alignment, this approach not only improves registration accuracy but also aligns with clinical practice, where vessels are typically a key focus of examinations. We validate the effectiveness of the proposed framework on a new UWF fluorescein angiography (FA) dataset and on the existing narrow-field FIRE (fundus image registration) dataset and demonstrate that it significantly outperforms prior retinal image registration methods. The proposed approach enhances the utility of large sets of longitudinal UWF images by enabling: (a) automatic computation of vessel change metrics and (b) standardized and co-registered examination that can better highlight changes of clinical interest to physicians.
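To illustrate the chamfer-alignment idea in the local refinement stage, here is a toy sketch: moving vessel points are scored by their mean distance to the nearest reference vessel point, and a transformation minimizing that cost is sought. The brute-force search over integer translations stands in for the parametric optimization a real implementation would use; all names and values are illustrative:

```python
import numpy as np

def chamfer_cost(params, moving_pts, ref_pts):
    # Translate the moving vessel points and score them by the mean
    # distance to the nearest reference vessel point (chamfer cost).
    tx, ty = params
    shifted = moving_pts + np.array([tx, ty])
    d = np.linalg.norm(shifted[:, None, :] - ref_pts[None, :, :], axis=2)
    return d.min(axis=1).mean()

def align_translation(moving_pts, ref_pts, search=range(-3, 4)):
    # Brute-force search over a small grid of integer translations.
    best = min(((chamfer_cost((tx, ty), moving_pts, ref_pts), (tx, ty))
                for tx in search for ty in search), key=lambda c: c[0])
    return best[1]

ref = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
moving = ref + np.array([2.0, -1.0])   # the same "vessel", shifted
shift = align_translation(moving, ref)  # recovers (-2, 1)
```

Because every vessel point contributes to the cost, the alignment uses the full vessel map rather than a sparse set of keypoints, which is the property the refinement stage exploits.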

Yue Zhao and 2 more

We develop and evaluate an automated data-driven framework for providing reviewer recommendations for submitted manuscripts. Given inputs comprising a set of manuscripts for review and a listing of a pool of prospective reviewers, our system uses a publisher database to extract papers authored by the reviewers, from which a Paragraph Vector (doc2vec) neural network model is learned and used to obtain vector-space embeddings of documents. Similarities between the embeddings of an individual reviewer's papers and a manuscript are then used to compute manuscript-reviewer match scores and to generate a ranked list of recommended reviewers for each manuscript. Our mainline system uses full-text versions of the reviewers' papers, which we demonstrate performs significantly better than models based on abstracts alone, the predominant paradigm in prior work. Direct retrieval of the reviewers' papers from a publisher database reduces reviewer burden, ensures up-to-date data, and eliminates the potential for misuse through data manipulation. We also propose a useful evaluation methodology that addresses hyperparameter selection and enables indirect comparisons with alternative approaches and on prior datasets. Finally, the work contributes a large-scale retrospective reviewer-matching dataset and evaluation that we hope will be useful for further research in this field. Our system is quite effective: for the mainline approach, expert judges rated 38% of the recommendations as Very Relevant, 33% as Relevant, 24% as Slightly Relevant, and only 5% as Irrelevant.
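The match-scoring step can be sketched as follows, assuming document embeddings (e.g., from a trained doc2vec model) are already available; the embeddings, reviewer names, and mean-similarity aggregation here are illustrative, not the paper's exact scoring rule:

```python
import numpy as np

def match_score(reviewer_paper_embs, manuscript_emb):
    # Score a reviewer as the mean cosine similarity between the
    # manuscript embedding and the embeddings of that reviewer's papers.
    m = manuscript_emb / np.linalg.norm(manuscript_emb)
    p = reviewer_paper_embs / np.linalg.norm(
        reviewer_paper_embs, axis=1, keepdims=True)
    return float((p @ m).mean())

def rank_reviewers(reviewer_embs, manuscript_emb):
    # Rank reviewers by descending match score.
    scores = {name: match_score(embs, manuscript_emb)
              for name, embs in reviewer_embs.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy 3-d embeddings: "alice" works on the manuscript's topic, "bob" does not.
manuscript = np.array([1.0, 0.0, 0.0])
reviewers = {
    "alice": np.array([[0.9, 0.1, 0.0], [1.0, 0.05, 0.0]]),
    "bob": np.array([[0.0, 1.0, 0.0], [0.0, 0.9, 0.4]]),
}
ranking = rank_reviewers(reviewers, manuscript)  # "alice" ranked first
```

Aggregating over all of a reviewer's papers, rather than a single representative one, lets the score reflect the breadth of each reviewer's expertise.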

Gaurav Sharma and 1 more

Li Ding and 4 more

We propose a deep-learning-based, annotation-efficient framework for vessel detection in ultra-widefield (UWF) fundus photography (FP) that does not require de novo labeled UWF FP vessel maps. Our approach utilizes concurrently captured UWF fluorescein angiography (FA) images, for which effective deep learning approaches have recently become available, and iterates between a multi-modal registration step and a weakly supervised learning step. In the registration step, the UWF FA vessel maps detected with a pre-trained deep neural network (DNN) are registered with the UWF FP via parametric chamfer alignment. The warped vessel maps can be used as tentative training data but inevitably contain incorrect (noisy) labels due to the differences between the FA and FP modalities and errors in the registration. In the learning step, a robust learning method is proposed to train DNNs with noisy labels. The detected FP vessel maps are then used for the registration in the following iteration. The registration and the vessel detection benefit from each other and are progressively improved. Once trained, the UWF FP vessel detection DNN from the proposed approach allows FP vessel detection without requiring concurrently captured UWF FA images. We validate the proposed framework on a new UWF FP dataset, PRIMEFP20, and on existing narrow-field FP datasets. Experimental evaluation, using both pixel-wise metrics and the CAL metrics designed to provide better agreement with human assessment, shows that the proposed approach provides accurate vessel detection without requiring manually labeled UWF FP training data.
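One common recipe for robust learning with noisy labels is small-loss selection: pixels (or samples) whose current loss is low are treated as having clean labels for the next update. The paper's exact robust-learning method is not specified here, so this sketch and its `keep_ratio` parameter are illustrative only:

```python
import numpy as np

def select_small_loss(losses, keep_ratio=0.7):
    # Keep the indices of the lowest-loss fraction of samples, on the
    # assumption that noisy-label samples tend to incur higher loss.
    losses = np.asarray(losses)
    k = max(1, int(len(losses) * keep_ratio))
    idx = np.argsort(losses)[:k]
    return np.sort(idx)

# Toy per-sample losses: samples 1 and 3 look mislabeled (high loss).
losses = [0.1, 2.0, 0.2, 5.0, 0.05]
keep = select_small_loss(losses, keep_ratio=0.6)  # indices 0, 2, 4
```

Only the selected subset would contribute to the gradient in the next training step, so label noise from registration errors is progressively filtered out as the iterations proceed.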