A Comparison of Humans and Machine Learning Classifiers Categorizing Emotion from Faces with Different Coverings
  • Harisu Abdullahi Shehu,
  • Will N Browne,
  • Hedwig Eisenbarth
Harisu Abdullahi Shehu
School of Engineering and Computer Science, Victoria University of Wellington

Corresponding Author: [email protected]

Will N Browne
School of Electrical Engineering and Robotics, Queensland University of Technology
Hedwig Eisenbarth
School of Psychology, Victoria University of Wellington


Partial face coverings such as sunglasses and face masks unintentionally obscure facial expressions, causing a loss of accuracy when humans and computer systems attempt to categorize emotion. With the rise of soft computing techniques that interact with humans, it is important to know not just their accuracy but also the confusion errors they make: do humans make fewer random or damaging errors than soft computing systems? We analyzed the impact of sunglasses and different face masks on the ability of humans and computer systems to categorize emotional facial expressions. Computer systems, represented by the VGG19, ResNet50, and InceptionV3 deep learning algorithms, and humans assessed images of people with varying emotional facial expressions under four covering conditions: unmasked, with a mask covering the lower face, with a partial mask with a transparent mouth window, and with sunglasses. The first contribution of this work is the finding that computer systems were better classifiers (98.48%) than humans (82.72%) for faces without coverings, a difference of more than 15 percentage points. This difference is due to the significantly lower accuracy of humans in categorizing anger, disgust, and fear expressions (ps < .001). However, the most novel aspect of the work is identifying how soft computing systems make different mistakes from humans on the same data. Humans mainly mistake unclear expressions for neutral emotion, which minimizes affective effects. Conversely, soft computing techniques often mistake unclear expressions for other emotion categories, which could lead to opposing decisions being made, e.g. a robot categorizing a fearful user as happy. Importantly, the pattern of misclassification can be adjusted by varying the balance of categories in the training set.
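The distinction between overall accuracy and the kind of confusion error can be illustrated with a minimal sketch. The function name `error_profile`, the label strings, and the toy data below are assumptions made for illustration only; they are not the authors' code, data, or results. The sketch shows how two raters with identical accuracy can differ in how many of their errors fall on the neutral category.

```python
from collections import Counter


def error_profile(true_labels, predicted_labels, neutral="neutral"):
    """Summarise accuracy and where the misclassifications land.

    Hypothetical helper: `neutral` names the label treated as the
    low-impact fallback category.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    total = len(true_labels)
    # Count each (true, predicted) pair that is an error.
    confusions = Counter(
        (t, p) for t, p in zip(true_labels, predicted_labels) if t != p
    )
    n_errors = sum(confusions.values())
    # Errors whose predicted label is the neutral category.
    to_neutral = sum(c for (_, p), c in confusions.items() if p == neutral)

    return {
        "accuracy": (total - n_errors) / total if total else 0.0,
        "errors": n_errors,
        "neutral_error_share": to_neutral / n_errors if n_errors else 0.0,
        "confusions": dict(confusions),
    }


# Toy labels, invented for illustration (not data from the study).
truth = ["fear", "fear", "anger", "happy", "disgust", "neutral"]
rater_a = ["neutral", "fear", "neutral", "happy", "disgust", "neutral"]
rater_b = ["happy", "fear", "disgust", "happy", "disgust", "neutral"]

profile_a = error_profile(truth, rater_a)
profile_b = error_profile(truth, rater_b)
print(profile_a["accuracy"], profile_a["neutral_error_share"])
print(profile_b["accuracy"], profile_b["neutral_error_share"])
```

In this toy case both raters make two errors out of six, but all of rater A's errors default to neutral while rater B's errors land on other emotion categories, including a fearful face labelled happy. An accuracy figure alone would not separate the two.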
10 Dec 2023: Submitted to TechRxiv
13 Dec 2023: Published in TechRxiv