Distinguishing Chatbot from Human
Gauri Anil Godghase, Rishit Agrawal, Tanush Obili and Mark Stamp
Additional contact information
Gauri Anil Godghase: San Jose State University
Rishit Agrawal: San Jose State University
Tanush Obili: San Jose State University
Mark Stamp: San Jose State University
A chapter in Machine Learning, Deep Learning and AI for Cybersecurity, 2025, pp 529-564 from Springer
Abstract:
There have been many recent advances in the fields of generative Artificial Intelligence (AI) and Large Language Models (LLMs), with the Generative Pre-trained Transformer (GPT) model being a leading “chatbot.” LLM-based chatbots have become so powerful that it may seem difficult to differentiate between human-written and machine-generated text. To analyze this problem, we have developed a new dataset consisting of more than 750,000 human-written paragraphs, with a corresponding chatbot-generated paragraph for each. Based on this dataset, we apply Machine Learning (ML) techniques to determine the origin of text (human or chatbot). Specifically, we consider two methodologies for tackling this issue: feature analysis and embeddings. Our feature analysis approach involves extracting a collection of features from the text for classification. We also explore the use of contextual embeddings and transformer-based architectures to train classification models. Our proposed solutions offer high classification accuracy and serve as useful tools for textual analysis, resulting in a better understanding of chatbot-generated text in this era of advanced AI technology.
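As a rough illustration of what the feature analysis approach mentioned in the abstract could look like, the sketch below computes a few simple stylometric features from a paragraph. The abstract does not list the features used in the chapter, so the function name `extract_features` and the four features chosen here are hypothetical examples, not the authors' method.

```python
import re
import string


def extract_features(text: str) -> dict[str, float]:
    """Compute a few simple stylometric features from a paragraph.

    The feature set is an illustrative assumption; the chapter's actual
    features are not given in the abstract.
    """
    # Lowercased word tokens (letters and apostrophes only).
    words = re.findall(r"[A-Za-z']+", text.lower())
    # Naive sentence split on runs of terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    # Guard against division by zero on empty input.
    n_words = max(len(words), 1)
    n_sents = max(len(sentences), 1)
    n_chars = max(len(text), 1)

    return {
        "avg_word_len": sum(len(w) for w in words) / n_words,
        "avg_sentence_len": len(words) / n_sents,
        "type_token_ratio": len(set(words)) / n_words,
        "punct_ratio": sum(c in string.punctuation for c in text) / n_chars,
    }


if __name__ == "__main__":
    print(extract_features("The cat sat. The dog ran!"))
```

Feature dictionaries like this one, computed for each human-written and chatbot-generated paragraph, could then be turned into numeric vectors and passed to any standard classifier.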
Date: 2025
Persistent link: https://EconPapers.repec.org/RePEc:spr:sprchp:978-3-031-83157-7_19
Ordering information: This item can be ordered from
http://www.springer.com/9783031831577
DOI: 10.1007/978-3-031-83157-7_19
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.