
Weakly Correlated Multimodal Sentiment Analysis: New Dataset and Topic-oriented Model
  • Wuchao Liu
  • Wengen Li
  • Yu-Ping Ruan
  • Yulou Shu
  • Juntao Chen
  • Yina Li
  • Caili Yu
  • Yichao Zhang
  • Jihong Guan
  • Shuigeng Zhou


Existing multimodal sentiment analysis models focus mainly on fusing highly correlated image-text pairs, and thus achieve unsatisfactory performance on multimodal social media data, which usually exhibits weak correlations between modalities. To address this issue, we first build a large multimodal social media sentiment analysis dataset, RU-Senti, which contains more than 100,000 image-text pairs with sentiment labels. We then propose a topic-oriented model (TOM), which assumes that the text is usually related to only a certain portion of the image content and that image-text pairs on the same topic often have similar sentiment tendencies. TOM learns topic information from the textual content and uses a topic-oriented feature alignment module to extract from the image the information that is correlated with the textual semantics, thereby aligning the two modalities. TOM then uses a transformer encoder, initialized with the parameters of a pre-trained vision-language model, to fuse the multimodal features for sentiment prediction. Experiments on the public MVSA-Multiple dataset and our RU-Senti dataset show that RU-Senti is well suited to studying weakly correlated multimodal sentiment analysis, and that TOM substantially outperforms state-of-the-art multimodal sentiment analysis methods and pre-trained vision-language models. The RU-Senti dataset and the code of TOM are available at https://github.com/PhenoixYANG/TOM.
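
The idea of using a text-derived topic to select the relevant portion of an image can be illustrated with a small attention-style pooling step. The sketch below is not the TOM implementation; it is a minimal illustration under assumptions not stated in the abstract: the topic and the image regions are already embedded as vectors of the same dimension, relevance is scored with a scaled dot product, and the function name `topic_guided_pooling` is hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def topic_guided_pooling(
    topic: Sequence[float],
    regions: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Weight image-region features by their similarity to a topic vector.

    Hypothetical helper for illustration only: `topic` stands in for a
    topic embedding learned from text, `regions` for per-region image
    features. Returns (attention weights, pooled image feature).
    """
    scale = math.sqrt(len(topic))
    # Scaled dot-product relevance of each region to the topic (assumed scoring).
    scores = [
        sum(t * r for t, r in zip(topic, region)) / scale
        for region in regions
    ]
    weights = softmax(scores)
    dim = len(regions[0])
    # Weighted sum of region features: topic-relevant regions dominate.
    pooled = [
        sum(w * region[d] for w, region in zip(weights, regions))
        for d in range(dim)
    ]
    return weights, pooled


if __name__ == "__main__":
    topic_vec = [1.0, 0.0]
    region_feats = [[4.0, 0.0], [0.0, 4.0]]  # first region matches the topic
    attn, fused = topic_guided_pooling(topic_vec, region_feats)
    print(attn, fused)
```

In this toy setting the first region, which points in the same direction as the topic vector, receives the larger weight, so the pooled feature is dominated by the topic-relevant part of the image. In the pipeline described above, a pooled feature of this kind would then be passed, together with the text features, to the transformer encoder for fusion.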