MAD Chairs: A new tool to evaluate AI

Chris Santos-Lang and Christopher M. Homan

Papers from arXiv.org

Abstract: This paper presents a new contribution to the problem of AI evaluation. Much as one might evaluate a machine in terms of its performance at chess, this approach involves evaluating a machine in terms of its performance at a game called "MAD Chairs." At the time of writing, evaluation with this game exposed opportunities to improve Claude, Gemini, ChatGPT, Qwen and DeepSeek. Furthermore, this paper sets a stage for future innovation in game theory and AI safety by providing an example of success with non-standard approaches to each: studying a game beyond the scope of previous game theoretic tools and mitigating a serious AI safety risk in a way that requires neither determination of values nor their enforcement.

Date: 2025-03, Revised 2025-03

Downloads: (external link)
http://arxiv.org/pdf/2503.20986 Latest version (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2503.20986

Page updated 2025-04-01
Handle: RePEc:arx:papers:2503.20986