Detection of Cyberbullying Through BERT and Weighted Ensemble of Classifiers
preprint
Posted on 2022-01-05, 20:56. Authored by Christopher Graney-Ward, Biju Issac, Lida Ketsbaia, Seibu Mary Jacob.

Due to the recent popularity and growth of social media platforms such as Facebook and Twitter, cyberbullying is becoming increasingly prevalent. Current research on cyberbullying, and the NLP techniques used to classify this kind of online behaviour, was studied first. This paper discusses experiments on combined Twitter datasets from Maryland and Cornell universities using different classification approaches: classical machine learning, RNN, CNN, and pretrained transformer-based classifiers. A state-of-the-art (SOTA) solution was achieved by optimising BERTweet with a one-cycle policy and a decoupled weight decay optimiser (AdamW), improving the previous F1-score by up to 8.4% and resulting in 64.8% macro F1. Particle Swarm Optimisation was later used to optimise the ensemble model. The ensemble, developed from the optimised BERTweet model and a collection of models with varying data representations, outperformed the standalone BERTweet model by 0.53 percentage points on the TweetEval dataset, resulting in 65.33% macro F1, and by 0.55 percentage points on the combined datasets, resulting in 68.1% macro F1.
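To make the weighted-ensemble idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes each model outputs per-class probabilities, combines them by a weighted average, and scores the result with macro F1. The function names, the toy probabilities, and the hand-picked weights are all hypothetical; in the paper the ensemble is optimised with Particle Swarm Optimisation, which is not reproduced here.

```python
def weighted_ensemble(prob_sets, weights):
    """Weighted soft voting (illustrative assumption, not the paper's code).

    prob_sets[m][i][c] is the probability model m assigns to class c
    for sample i. Returns the predicted class index for each sample.
    """
    total = sum(weights)
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    preds = []
    for i in range(n_samples):
        # Weighted average of the models' probabilities for each class.
        scores = [
            sum(w * probs[i][c] for w, probs in zip(weights, prob_sets)) / total
            for c in range(n_classes)
        ]
        preds.append(max(range(n_classes), key=scores.__getitem__))
    return preds


def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of the per-class F1 scores."""
    f1_scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / n_classes


# Hypothetical toy data: two models, four samples, two classes.
model_a = [[0.9, 0.1], [0.4, 0.6], [0.6, 0.4], [0.2, 0.8]]
model_b = [[0.6, 0.4], [0.7, 0.3], [0.1, 0.9], [0.3, 0.7]]
y_true = [0, 0, 1, 1]

# Two candidate weightings; an optimiser such as PSO would search this space.
for weights in ([0.7, 0.3], [0.3, 0.7]):
    y_pred = weighted_ensemble([model_a, model_b], weights)
    print(weights, y_pred, round(macro_f1(y_true, y_pred, 2), 4))
```

The two weightings yield different predictions on the second sample, which is why the choice of ensemble weights can move macro F1 even when the underlying models are unchanged.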
Email Address of Submitting Author: bissac@ieee.org
ORCID of Submitting Author: 0000-0002-1109-8715
Submitting Author's Institution: Northumbria University
Submitting Author's Country: United Kingdom