
A Comparative Analysis of Sentiment Classification Models for Improved Performance Optimization
  • Varun Iyer
Center For Fundamental Research and Creative Education

Corresponding Author: [email protected]


Since its inception, the field of Natural Language Processing has required AI/ML engineers to formulate and optimize machine learning models for sentiment analysis. This research examines how accurately machine learning models, both simple and complex, ascertain sentiment, and describes methods for optimizing their efficacy across multiple metrics. To that end, the study aims to identify the strengths and weaknesses of diverse sentiment analysis models, compare their performance in terms of accuracy, speed, and efficiency, and evaluate the impact of pre-processing techniques such as data cleaning, feature selection, and dimensionality reduction on sentiment analysis performance. It also aims to present recommendations for selecting the most suitable sentiment analysis model for specific applications and data requirements. The methodology covers a wide range of models: Decision Tree, KNN, Logistic Regression, Naive Bayes, Perceptron, Random Forest, SVM, LSTM, and Bi-LSTM. Model performance is measured with the confusion matrix, accuracy, precision, recall, and F1 score, so that each model's behaviour is assessed from several angles. The study further employs preprocessing techniques including contraction handling, stopword removal, lemmatization, and negation handling, as well as the feature extraction methods Bag of Words and TF-IDF. Together, these methods aim to provide a thorough evaluation of the various approaches to sentiment analysis.
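As a rough illustration of the kind of pipeline the abstract describes, the following minimal Python sketch combines contraction handling, negation handling, stopword removal, a Bag of Words representation, and the accuracy, precision, recall, and F1 metrics. It is not the study's implementation: the word lists, the `NOT_` prefix convention, and the function names `preprocess`, `bag_of_words`, and `binary_metrics` are assumptions chosen for illustration only.

```python
from collections import Counter

# Illustrative word lists (assumptions, not taken from the study).
STOPWORDS = {"the", "a", "an", "is", "was", "this", "it"}
NEGATIONS = {"not", "no", "never"}
CONTRACTIONS = {"isn't": "is not", "wasn't": "was not", "don't": "do not"}


def preprocess(text):
    """Expand contractions, mark the word after a negation, drop stopwords."""
    tokens = []
    negate = False
    for raw in text.lower().split():
        word = raw.strip(".,!?")
        # A contraction may expand into several words.
        for part in CONTRACTIONS.get(word, word).split():
            if part in NEGATIONS:
                negate = True  # flag the next content word as negated
                continue
            if part in STOPWORDS:
                continue
            tokens.append("NOT_" + part if negate else part)
            negate = False
    return tokens


def bag_of_words(tokens):
    """Map each token to its count in the document."""
    return Counter(tokens)


def binary_metrics(y_true, y_pred, positive=1):
    """Compute confusion-matrix counts and the derived metrics."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "confusion": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    tokens = preprocess("This movie wasn't good!")
    print(tokens, dict(bag_of_words(tokens)))
    print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 1]))
```

In this sketch, negation is attached only to the next content word. This is one simple convention among several; a wider negation scope or a lemmatization step could be substituted without changing the rest of the pipeline.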
10 Mar 2024 Submitted to TechRxiv
18 Mar 2024 Published in TechRxiv