Analyzing Language Patterns for Depression Detection on Social Media: Insights from Reddit Data and Machine Learning Techniques

Madan Mohan Tito Ayyalasomayajula

doi:10.36227/techrxiv.171340759.91176164/v1

Abstract

Language can reveal a great deal about a person's emotional state, social standing, and even personality. In this research, the authors examined which words are most helpful in determining whether a person is depressed. The authors also incorporated a variety of additional dataset features to see whether they would support a more accurate judgement. The data were obtained from the social networking site Reddit and consist of erratic messages and comments posted by a group of depressed users. The proposed model uses logistic regression with Term Frequency-Inverse Document Frequency (TF-IDF) features, because this combination performed best. One feature that contributed a marginal speed improvement was the average time between two consecutive posts or comments. In terms of F1-score, the proposed model outperformed other models that use the same dataset. Diagnosing mental illness is becoming a significant issue as people's understanding of the value of mental health grows. Because each mental disorder is complex, many psychiatrists have found it difficult to diagnose a patient's mental illness, making it impossible to start the right course of treatment before it is too late. However, the integration of social media into people's daily lives has created a setting in which further knowledge about a patient's mental illness may be acquired.
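
The approach named in the abstract (TF-IDF text features, one extra feature for the average time between consecutive posts, and a logistic regression classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy posts, labels, gap values, hyperparameters, and all function names below are hypothetical and chosen only for illustration, and the sketch uses plain Python with a smoothed IDF and simple stochastic gradient descent rather than any particular library.

```python
import math
from collections import Counter

def tokenize(text):
    # Simplistic whitespace tokenizer (an assumption; the paper's preprocessing is not given here).
    return text.lower().split()

def fit_idf(docs):
    # Smoothed inverse document frequency: log((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def tfidf_vector(doc, idf, vocab):
    # Term count times IDF, then L2-normalised.
    counts = Counter(tokenize(doc))
    vec = [counts[t] * idf[t] for t in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=500):
    # Logistic regression fitted with plain stochastic gradient descent.
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict_proba(x, w, b):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical toy data: 1 = post from a depressed user, 0 = control.
posts = [
    "i feel hopeless and tired all the time",
    "nothing matters and i feel empty",
    "great hike with friends this weekend",
    "excited about the new game release",
]
labels = [1, 1, 0, 0]
# Hypothetical average time (hours) between a user's consecutive posts or comments.
avg_gap_hours = [2.0, 3.5, 30.0, 48.0]

idf = fit_idf(posts)
vocab = sorted(idf)
max_gap = max(avg_gap_hours)
# Append the scaled posting-gap feature to each TF-IDF vector.
X = [tfidf_vector(p, idf, vocab) + [g / max_gap] for p, g in zip(posts, avg_gap_hours)]
w, b = train_logreg(X, labels)
train_probs = [predict_proba(x, w, b) for x in X]
```

On real data one would evaluate on held-out posts and report F1-score, as the abstract does; the toy example only shows how the text features and the timing feature can be combined in one feature vector.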

15 Apr 2024: Submitted to TechRxiv

18 Apr 2024: Published in TechRxiv
