
Comparing Human-Only, AI-Assisted, and AI-Led Teams on Assessing Research Reproducibility in Quantitative Social Science

Abel Brodeur, David Valenta, Alexandru Marcoci, Juan P. Aparicio, Derek Mikola, Bruno Barbarioli, Rohan Alexander, Lachlan Deer, Tom Stafford, Lars Vilhuber, Gunther Bensch, Dirk Goldschmitt, Amélie Gourdon-Kanhukamwe, Andrea Gregor de Varda, Idaliya Grigoryeva, Alexi Gugushvili, Aaron H.A. Fletcher, Florian Habermann, Márton Hablicsek, Joanne Haddad, Jonathan Hall, Mohamed Abdelhady, Olle Hammar, Malek Hassouneh, Carina I. Hausladen, Sophie C. F. Hendrikse, Matthew Hepplewhite, Anson T. Y. Ho, Senan Hogan-Hennessy, Elliot Howley, Gaoyang Huang, Héloïse Hulstaert, Yousra Abdelmoula, Zlatomira G. Ilchovska, Niklas Jakobsson, Joakim Jansson, Ewa Jarosz, Hossein Jebeli, Yanchen Jiang, Hiba Junaid, Rohan Kalluraya, Edmund Kelly, Eva Kimel, Ghina Abdul Baki, Sorravich Kingsuwankul, Valentin Klotzbücher, Daniel Krähmer, Pijus Krūminas, Nicholas Kruus, Essi Kujansuu, Christoph F. Kurz, Stephan Küster, Blake Lee-Whiting, Felix Lewandowski, Tomás Aguirre, Tongzhe Li, Ruoxi Li, Dan Liu, Jiacheng Liu, Helix Lo, Katharina Loter, Felipe Macedo Dias, Christopher R. Madan, Nicolas Mäder, Marco Mandas, Sriraj Aiyer, Jan Marcus, Diego Marino Fages, Xavier Martin, Ryan McWay, Daniel Medina-Gaspar, Sisi Meng, Lingyu Meng, Alex P. Miller, Thibault Mirabel, Dibya Deepta Mishra, Shumi Akhtar, Sumit Mishra, Belay W. Moges, Morteza Mohandes Mojarrad, Myra Mohnen, Louis-Philippe Morin, Fabio Motoki, Lucija Muehlenbachs, Gastón Mullin, Andreea Musulan, Sara Muzzì, Farida Akhtar, James A. C. Myers, Florian Neubauer, Tuan Nguyen, Ali Niazi, Ardyn Nordstrom, Bartłomiej Nowak, O’Habib, Daneal, Tim Ölkers, Justin Ong, Valeria Orozco Castiblanco, Melle R. Albada, Ömer Özak, Ali Ozkes, Mikael Paaso, Shubham Pandey, Varvara Papazoglou, Romeo Penheiro, Linh Pham, Ulrike Phieler, Peter Pütz, Quan Qi, Micah Altman, Jingyi Qiu, David A. Reinstein, Juuso Repo, Nicolas Rudolf, Shree Saha, Orkun Saka, Chiara Saponaro, Georg Sator, Martijn Schoenmakers, Raffaello Seri, David Angenendt, Meet Shah, Paul Sibille, Christoph Siemroth, Vladimir Skavysh, Ben Slater, Wenting Song, Stefan Staubli, Tobias Steindl, Nomwendé Steven Waongo, Paul Stott, Zahra Arjmandi Lari, Stephenson Strobel, Roshini Sudhaharan, Pu Sun, Scott D. Swain, Oleksandr Talavera, Hanz M. Tantiangco, Georgy Tarasenko, Boyd Tarlinton, Mariam Tarraf, Ken Teoh, Jorge Armando De León Tejada, Rémi Thériault, Bethan Thompson, Tonghui Tian, Wenjie Tian, Manuel Tobias Rein, Emmanuel Tolani, Nicolai Topstad Borgen, Solveig Topstad Borgen, Javier Torralba, Carolina Velez-Ospina, Igor Asanov, Man Wai Mak, Lukas Wallrich, Zeyang Wang, Leah Ward, Matthew D. Webb, Duncan Webb, Bryan Weber, Christoph Weber, Wei-Chien Weng, Christian Westheide, Anastasiya-Mariya Asanov Noha, Tom Wilkinson, Kwong-Yu Wong, Marcin Wroński, Zhuangchen Wu, Qixia Wu, Victor Y. Wu, Bohan Xiao, Feihong Xu, Cong Xu, Pranav Yadav, Rebecca Ashong, Yu Yang Chou, Luther Yap, Myra Yazbeck, Zuzanna Zagrodzka, Tahreen Zahra, Mirela Zaneva, Xiaomeng Zhang, Ziwei Zhao, Han Zhong, Aras Zirgulis, Tobias Auer, Jiacheng Zou, Floris Zoutman, Christelle Zozoungbo, Francisco J. Bahamonde-Birke, Bradley J. Baker, Söhnke Bartram, Dongqi Bao, Lucija Batinovic, Tommaso Batistoni, Monica Beeder, Louis-Philippe Beland, Carsten Bienz, Christ Billy Aryanto, Cylcia Bolibaugh, Carl Bonander, Ramiro Bravo, Katherine Brennan, Egor Bronnikov, Stephan Bruns, Nino Buliskeria, Sara Caicedo-Silva, Andrea Calef, Solomon Caulker, Simonas Cepenas, Arthur Chatton, Zirou Chen, Ngozi Chioma Ewurum, Anda-Bianca Ciocîrlan, Felix J. Clouth, Jason Collins, Nikolai Cook, Cesar Cornejo, João Craveiro, Jing Cui, Niveditha Chalil Vayalabron, Christian Czymara, Carlos Daniel Bermúdez Jaramillo, Hannes Datta, Lien Denoo, Arshia Dhaliwal, Nency Dhameja, Elodie Djemai, Erwan Dujeancourt, Uğurcan Dündar, Thibaut Duprey, Yasmine Eissa, Youssef El Fassi, Ismail El Fassi, Keaton Ellis, Ali Elminejad, Mahmoud Elsherif, Aysil Emirmahmutoglu, Giulian Etingin-Frati, Emeka Eze, Jan Fabian Dollbaum, Jan Feld Victoria, Andres Felipe Rengifo Jaramillo, Guidon Fenig, Victoria Fernandes, Lenka Fiala, Lukas Fink, Sara Fish, Jack Fitzgerald, Rachel Forshaw, Alexandre Fortier-Chouinard, Louis Fréget, Joris Frese, Jacopo Gabani, Sebastian Gallegos, Max C. Gamill, Attila Gáspár, Romain Gauriot, Evelina Gavrilova, Diogo Geraldes, Giulio Giacomo Cantone and Grant Gibson

No 195, I4R Discussion Paper Series from The Institute for Replication (I4R)

Abstract: This study evaluates the effectiveness of varying levels of human and artificial intelligence (AI) integration in reproducibility assessments of quantitative social science research. We computationally reproduced quantitative results from published articles in the social sciences with 288 researchers, randomly assigned to 103 teams across three groups: human-only teams, AI-assisted teams, and teams whose task was to minimally guide an AI to conduct reproducibility checks (the "AI-led" approach). Findings reveal that when working independently, human teams matched the reproducibility success rates of teams using AI assistance, while both groups substantially outperformed AI-led approaches (with human teams achieving 57 percentage points higher success rates than AI-led teams, p < 0.001). Human teams were particularly effective at identifying serious problems in the analysis: they found significantly more major errors than both AI-assisted teams (0.7 more errors per team, p = 0.017) and AI-led teams (1.1 more errors per team, p < 0.001). AI-assisted teams demonstrated an advantage over more automated approaches, detecting 0.4 more major errors per team than AI-led teams (p = 0.029), though still significantly fewer than human-only teams. Finally, both human and AI-assisted teams significantly outperformed AI-led approaches in both proposing (25 percentage points difference, p = 0.017) and implementing (33 percentage points difference, p = 0.005) comprehensive robustness checks. These results underscore both the strengths and limitations of AI assistance in research reproduction and suggest that, despite impressive advancements in AI capability, key aspects of the research publication process still require substantial human involvement.

Date: 2025
New Economics Papers: this item is included in nep-ain

Downloads: (external link)
https://www.econstor.eu/bitstream/10419/308508/1/I4R-DP195.pdf (application/pdf)


Persistent link: https://EconPapers.repec.org/RePEc:zbw:i4rdps:195

Bibliographic data for series maintained by ZBW - Leibniz Information Centre for Economics.

 
Page updated 2025-04-10
Handle: RePEc:zbw:i4rdps:195