Generative AI without guardrails can harm learning: Evidence from high school mathematics
Hamsa Bastani,
Osbert Bastani,
Alp Sungu,
Haosen Ge,
Özge Kabakcı and
Rei Mariman
Additional contact information
Hamsa Bastani: Wharton AI & Analytics, Philadelphia, PA 19104
Osbert Bastani: Department of Operations, Information, and Decisions, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA 19104
Alp Sungu: Department of Operations, Information, and Decisions, Wharton School, University of Pennsylvania, Philadelphia, PA 19104
Haosen Ge: Wharton AI & Analytics, Philadelphia, PA 19104
Özge Kabakcı: Department of Mathematics, Budapest British International School, Budapest 1125, Hungary
Rei Mariman: Independent, Philadelphia, PA 19104
Proceedings of the National Academy of Sciences, 2025, vol. 122, issue 26, e2422633122
Abstract:
Generative AI is poised to revolutionize how humans work, and has already demonstrated promise in significantly improving human productivity. A key question is how generative AI affects learning, namely, how humans acquire new skills as they perform tasks. Learning is critical to long-term productivity, especially since generative AI is fallible and users must check its outputs. We study this question via a field experiment where we provide nearly a thousand high school math students with access to generative AI tutors. To understand the differential impact of tool design on learning, we deploy two generative AI tutors: one that mimics a standard ChatGPT interface (“GPT Base”) and one with prompts designed to safeguard learning (“GPT Tutor”). Consistent with prior work, our results show that having GPT-4 access while solving problems significantly improves performance (48% improvement in grades for GPT Base and 127% for GPT Tutor). However, we additionally find that when access is subsequently taken away, students actually perform worse than those who never had access (17% reduction in grades for GPT Base); i.e., unfettered access to GPT-4 can harm educational outcomes. These negative learning effects are largely mitigated by the safeguards in GPT Tutor. Without guardrails, students attempt to use GPT-4 as a “crutch” during practice problem sessions, and subsequently perform worse on their own. Thus, decision-makers must be cautious about design choices underlying generative AI deployments to preserve skill learning and long-term productivity.
Keywords: generative AI; education; skill acquisition; personalized tutoring
Date: 2025
Downloads: https://doi.org/10.1073/pnas.2422633122 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:nas:journl:v:122:y:2025:p:e2422633122