MVLP: Multi-perspective Vision-Language Pre-training Model for Ethically Aligned Meme Detection
  • Ning Zhang
  • Xuan Feng
  • Tianlong Gu
  • Liang Chang


Ethically aligned design aims to ensure that intelligent systems remain human-centric and serve humanity’s values and ethical principles. Moderating content in line with ethically aligned design is a critical step toward trustworthy AI. However, current content moderation methods focus on detecting harmful content and ignore ethical considerations. Likewise, multi-modal meme detection tasks lack designs for more complex ethical discrimination. Memes have become an influential medium for conveying ideas and socio-cultural and political views, and they pose an interesting multi-modal fusion problem: understanding a meme requires a very specific combination of information from different modalities (the text and the image). Ethically aligned meme detection is therefore crucial for examining these modalities and ensuring that they are aligned with our social values and goals. In this paper, we propose MVLP, a Multi-perspective Vision-Language Pre-training model for ethically aligned meme detection.