Detecting toxic comments on social media: an extensive evaluation of machine learning techniques
Dharil Patel,
Pijush Kanti Dutta Pramanik,
Chaitanya Suryawanshi and
Preksha Pareek
Additional contact information
Dharil Patel: Symbiosis Institute of Technology
Pijush Kanti Dutta Pramanik: Galgotias University
Chaitanya Suryawanshi: Symbiosis Institute of Technology
Preksha Pareek: Thakur College of Engineering and Technology
Journal of Computational Social Science, 2025, vol. 8, issue 1, No 20, 18 pages
Abstract:
The prevalence of toxic comments on social networking sites poses a significant threat to freedom of speech and to the psychological well-being of online users. To address this challenge, researchers have turned to machine learning algorithms as a means of identifying and categorizing toxic content. This study presents a comprehensive comparison of multiple machine learning techniques for predicting toxic posts on a social media platform. The Jigsaw toxic comment classification dataset was used to test the performance of nine different machine learning models. Various evaluation metrics, including accuracy, precision, recall, and F1-score, were employed to assess the models' effectiveness. Additionally, hyperparameter tuning was performed for each algorithm, and the outcomes were compared to determine the optimal technique while examining the effects of hyperparameter variations. The results show that the naive Bayes classifier offers the best balance of accuracy and efficiency, achieving an accuracy of 97.30% with a run time of 0.06, while the XGBoost algorithm recorded a marginally higher accuracy of 97.31% at a far greater run time of 41.06. The findings of this study have important implications for the development of efficient online hate speech identification systems. By leveraging the insights gained from this comparative analysis, researchers and practitioners can design more effective strategies for managing and mitigating the prevalence of toxic comments in online communities, ultimately fostering a safer and more inclusive digital environment.
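To illustrate the kind of comparative setup the abstract describes, the following is a minimal sketch, not the authors' code: it assumes scikit-learn and xgboost are available and that the Jigsaw training data is a local CSV ("train.csv" is a placeholder path) with the usual "comment_text" and binary "toxic" columns; the TF-IDF settings and XGBoost hyperparameters are illustrative assumptions only.

```python
# Sketch: compare a naive Bayes and an XGBoost classifier on Jigsaw-style data,
# reporting accuracy, precision, recall, F1-score, and training time.
import time

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from xgboost import XGBClassifier

# Hypothetical path to the Jigsaw toxic comment training file.
df = pd.read_csv("train.csv")
X_train, X_test, y_train, y_test = train_test_split(
    df["comment_text"], df["toxic"], test_size=0.2, random_state=42, stratify=df["toxic"]
)

# TF-IDF bag-of-words features are a common choice for this dataset.
vectorizer = TfidfVectorizer(max_features=50_000, stop_words="english")
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

for name, model in [
    ("Naive Bayes", MultinomialNB()),
    ("XGBoost", XGBClassifier(n_estimators=300, eval_metric="logloss")),
]:
    start = time.perf_counter()
    model.fit(X_train_vec, y_train)
    elapsed = time.perf_counter() - start
    pred = model.predict(X_test_vec)
    print(
        f"{name}: acc={accuracy_score(y_test, pred):.4f} "
        f"prec={precision_score(y_test, pred):.4f} "
        f"rec={recall_score(y_test, pred):.4f} "
        f"f1={f1_score(y_test, pred):.4f} "
        f"train_time={elapsed:.2f}s"
    )
```

The same loop extends naturally to the other classifiers and to a hyperparameter grid search, which is the essence of the comparative analysis reported in the paper.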
Keywords: Toxic comments; Natural language processing; Text classification; Social media; Machine learning; Comparative analysis
Date: 2025
Downloads: http://link.springer.com/10.1007/s42001-024-00349-5 (abstract, text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:jcsosc:v:8:y:2025:i:1:d:10.1007_s42001-024-00349-5
Ordering information: This journal article can be ordered from http://www.springer. ... iences/journal/42001
DOI: 10.1007/s42001-024-00349-5
Journal of Computational Social Science is currently edited by Takashi Kamihigashi
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.