A Decoupling Paradigm with Prompt Learning for Remote Sensing Image Change Captioning

Posted on 2023-06-07, 02:21. Authored by Chenyang Liu, Rui Zhao, Jianqi Chen, Zipeng Qi, Zhengxia Zou, Zhenwei Shi

Remote sensing image change captioning (RSICC) is a novel task that aims to describe the differences between bi-temporal images in natural language. Previous methods ignore an important characteristic of the task: its difficulty differs between unchanged and changed image pairs. They process unchanged and changed pairs in a coupled way, which often confuses change captioning. In this paper, we ease the task by decoupling it into two questions: whether changes have occurred, and what those changes are. An image-level classifier performs binary classification to answer the first question. A feature-level encoder extracts discriminative features that help the caption generation module answer the second. For caption generation, we use prompt learning to introduce pre-trained large language models (LLMs) into the RSICC task. We propose a multi-prompt learning strategy that generates a set of unified prompts and a class-specific prompt conditioned on the image-level classifier's result; together, these prompts tell the pre-trained LLM whether changes exist and guide it to generate captions. Finally, the multiple prompts and the features from the feature-level encoder are fed into a frozen LLM for captioning. Compared with previous methods, our method leverages the language abilities of the pre-trained LLM to generate plausible captions without training the LLM itself. Extensive experiments show that our method is effective and achieves state-of-the-art performance. In addition, a further experiment demonstrates that our decoupling paradigm is more promising than the previous coupled paradigm for the RSICC task.
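
As a rough illustration of the decoupled flow described in the abstract, the sketch below first decides whether a change occurred, then uses that decision to pick a class-specific prompt and concatenates it with the unified prompts and the visual features to form the input to a frozen LLM. It is a toy under stated assumptions, not the authors' implementation: the names `has_changed` and `build_llm_input`, the threshold-based classifier, the string placeholders standing in for embeddings, and the ordering of the concatenation are all hypothetical.

```python
from typing import Dict, List, Sequence

Token = str  # hypothetical stand-in for an embedding vector


def has_changed(before: Sequence[float], after: Sequence[float],
                threshold: float = 0.5) -> bool:
    """Toy stand-in for the image-level classifier.

    Assumption: the mean absolute feature difference is compared with a
    fixed threshold. The paper's classifier is a learned module.
    """
    diff = sum(abs(a - b) for a, b in zip(before, after)) / len(before)
    return diff > threshold


def build_llm_input(unified: List[Token], class_specific: Dict[bool, Token],
                    visual: List[Token], changed: bool) -> List[Token]:
    """Assemble the sequence passed to the frozen LLM.

    Assumption: unified prompts come first, then the class-specific prompt
    selected by the classifier result, then the visual features.
    """
    return [*unified, class_specific[changed], *visual]


# Issue 1: has a change occurred? (image-level decision)
changed = has_changed([0.0, 0.0], [1.0, 1.0])

# Issue 2: what changed? (prompts plus features go to the frozen LLM)
sequence = build_llm_input(
    unified=["<u1>", "<u2>"],
    class_specific={True: "<changed>", False: "<unchanged>"},
    visual=["<v1>", "<v2>", "<v3>"],
    changed=changed,
)
```

In a real system the placeholders would be learnable prompt embeddings and encoder outputs, and only those would receive gradients while the LLM weights stay fixed.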


Submitting Author's Institution

Image Processing Center, School of Astronautics, Beihang University

Submitting Author's Country

  • China