READ THE DAMN DOCUMENTATION (CAREFULLY). A case study using the PISA data

John Jerrim, Maria Palma Caravajal, Jake Anders, Maria Ladron de Guevara Rodriguez and Oscar Marcenaro-Gutierrez
Additional contact information
John Jerrim: UCL Social Research Institute
Maria Palma Caravajal: UCL Social Research Institute
Jake Anders: UCL Centre for Education Policy & Equalising Opportunities
Maria Ladron de Guevara Rodriguez: Departamento de Economia Aplicada, Universidad de Malaga
Oscar Marcenaro-Gutierrez: Departamento de Economia Aplicada, Universidad de Malaga

No 25-14, CEPEO Working Paper Series from UCL Centre for Education Policy and Equalising Opportunities

Abstract: When you get access to a new dataset, do you always carefully read the documentation first? We all know we should. But - let's be honest - it's a lot more fun to just start playing with the data. This can, however, be a dangerous game to play. This paper presents a case study of this matter using the OECD’s Programme for International Student Assessment (PISA). A survey question included in this study attempts to measure student truancy across countries over time. The international survey documentation suggests that an identical question has been used across countries and cycles. Yet the national documentation illustrates how a subtle - yet important - change to the wording was made in some countries in 2015. We demonstrate how researchers could easily miss this change and how this would affect inferences about changes in truancy rates before and after the COVID-19 pandemic. Attempts to use artificial intelligence and large language models to spot this problem resulted in overconfidently incorrect advice. The findings thus serve as a reminder to even the most experienced data analysts (including ourselves) - ALWAYS READ THE SURVEY DOCUMENTATION CAREFULLY.

Keywords: PISA; AI; data documentation; survey methodology; truancy; absences; COVID-19
Pages: 28
Date: 2025-11, Revised 2025-11

Downloads:
https://repec-cepeo.ucl.ac.uk/cepeow/cepeowp25-14.pdf Initial version, 2025 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:ucl:cepeow:25-14

Bibliographic data for series maintained by Jake Anders.

 
Page updated 2025-12-18
Handle: RePEc:ucl:cepeow:25-14