Delegation to Artificial Intelligence can increase dishonest behaviour
Nils Köbis,
Zoe Rahwan,
Raluca Rilla,
Bramantyo Ibrahim Supriyatno,
Clara Bersch,
Tamer Ajaj,
Jean-François Bonnefon and
Iyad Rahwan
Additional contact information
Nils Köbis: University of Duisburg-Essen, Max Planck Institute for Human Development - Max-Planck-Gesellschaft
Zoe Rahwan: Max Planck Institute for Human Development - Max-Planck-Gesellschaft
Raluca Rilla: Max Planck Institute for Human Development - Max-Planck-Gesellschaft
Bramantyo Ibrahim Supriyatno: Max Planck Institute for Human Development - Max-Planck-Gesellschaft
Clara Bersch: Max Planck Institute for Human Development - Max-Planck-Gesellschaft
Tamer Ajaj: Max Planck Institute for Human Development - Max-Planck-Gesellschaft
Jean-François Bonnefon: TSE-R - Toulouse School of Economics - UT Capitole - Université Toulouse Capitole - Comue de Toulouse - Communauté d'universités et établissements de Toulouse - EHESS - École des hautes études en sciences sociales - CNRS - Centre National de la Recherche Scientifique - INRAE - Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement
Iyad Rahwan: Max Planck Institute for Human Development - Max-Planck-Gesellschaft
Working Papers from HAL
Abstract:
While Artificial Intelligence enables productivity gains from delegating tasks to machines, it may facilitate the delegation of unethical behaviour. This risk is highly relevant amid the rapid rise of 'agentic' AI systems. Here we demonstrate this risk by having human principals instruct machine agents to perform tasks with incentives to cheat. Requests for cheating increased when principals could induce machine dishonesty without telling the machine precisely what to do, through supervised learning or high-level goal-setting. These effects held whether delegation was voluntary or mandatory. We also examined delegation via natural language to Large Language Models. While principals' cheating requests were not always higher for machine agents, compliance diverged sharply: Machines were far more likely than human agents to carry out fully unethical instructions. This compliance could be curbed, but usually not eliminated, with the injection of prohibitive, task-specific guardrails. Our results highlight ethical risks in the context of increasingly accessible and powerful machine delegation, and suggest design and policy strategies to mitigate them.
Date: 2025-09
Note: View the original document on HAL open archive server: https://hal.science/hal-05273501v1
Downloads:
https://hal.science/hal-05273501v1/document (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:hal:wpaper:hal-05273501
Bibliographic data for series maintained by CCSD.