Novel Speech-Based Emotion Climate Recognition in Peers' Conversations Incorporating Affect Dynamics and Temporal Convolutional Neural Networks
  • Ghada Alhussein ,
  • Mohanad Alkhodari ,
  • Ahsan Khandoker ,
  • Leontios Hadjileontiadis
Khalifa University of Science and Technology

Corresponding Author: [email protected]

Abstract

Peers’ conversations provide a rich source of emotional information. Apart from facial and gestural expressions, this information is also naturally conveyed via peers’ speech, contributing to the establishment of a dynamic emotion climate (EC) during their conversational interaction. Recognition of EC could provide an additional source for understanding peers’ social interaction and behavior on top of the actual conversational content. Here, we propose a novel approach for speech-based EC recognition, namely AffECt, which combines peers’ complex affect dynamics (AD) with deep features extracted from speech signals using Temporal Convolutional Neural Networks (TCNNs). AffECt was tested and cross-validated on data drawn from three open datasets, i.e., K-EmoCon, IEMOCAP, and SEWA, in terms of EC arousal/valence level classification. The experimental results show that AffECt achieves EC classification accuracy of up to 83.3% and 80.2% for arousal and valence, respectively, clearly surpassing the results reported in the literature and exhibiting robust performance across different languages. Moreover, there is a distinct improvement when the AD are combined with the TCNN, compared to baseline deep learning approaches. These results demonstrate the effectiveness of AffECt in speech-based EC recognition, paving the way for many applications, e.g., in patients’ group therapy, negotiations, and emotion-aware mobile applications.
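The abstract does not detail the TCNN architecture, but the defining ingredient of a temporal convolutional network is the dilated causal convolution, whose receptive field grows with depth so the model can summarize long stretches of a speech feature stream without looking at future samples. The following is a minimal NumPy sketch of that mechanism only; all weights, kernel sizes, and dilations here are hypothetical and not taken from the paper:

```python
import numpy as np

def causal_dilated_conv1d(x, w, dilation=1):
    """Causal 1-D convolution: y[t] depends only on x[t], x[t-d], x[t-2d], ..."""
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])  # left-pad so the output stays causal
    return np.array([
        sum(w[i] * xp[t + pad - i * dilation] for i in range(k))
        for t in range(len(x))
    ])

def tiny_tcn(x, weights, dilations):
    """Stack causal dilated conv layers with ReLU activations.

    With kernel size k, the receptive field is 1 + (k - 1) * sum(dilations),
    so doubling the dilation at each layer grows it exponentially with depth.
    """
    h = x
    for w, d in zip(weights, dilations):
        h = np.maximum(causal_dilated_conv1d(h, w, d), 0.0)  # ReLU
    return h
```

Because every layer is causal, perturbing the last input sample can change only the last output, which is easy to verify by comparing `tiny_tcn` outputs on two inputs that differ only at the final index.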