Can Machines Think Like Humans? A Behavioral Evaluation of LLM-Agents in Dictator Games
Ji Ma
Papers from arXiv.org
Abstract:
As Large Language Model (LLM)-based agents increasingly undertake real-world tasks and engage with human society, how well do we understand their behaviors? We (1) investigate how LLM agents' prosocial behaviors -- a fundamental social norm -- can be induced by different personas and benchmarked against human behaviors; and (2) introduce a behavioral and social science approach to evaluate LLM agents' decision-making. We explored how different personas and experimental framings affect these AI agents' altruistic behavior in dictator games and compared their behaviors within the same LLM family, across various families, and with human behaviors. The findings reveal substantial variations and inconsistencies among LLMs and notable differences compared to human behaviors. Merely assigning a human-like identity to LLMs does not produce human-like behaviors. Despite being trained on extensive human-generated data, these AI agents are unable to capture the internal processes of human decision-making. Their alignment with human is highly variable and dependent on specific model architectures and prompt formulations; even worse, such dependence does not follow a clear pattern. LLMs can be useful task-specific tools but are not yet intelligent human-like agents.
Date: 2024-10, Revised 2024-12
New Economics Papers: this item is included in nep-ain, nep-big, nep-cmp, nep-evo and nep-exp
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
http://arxiv.org/pdf/2410.21359 Latest version (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2410.21359
Access Statistics for this paper
More papers in Papers from arXiv.org
Bibliographic data for series maintained by arXiv administrators ().