Measuring Test Measurement Error

Donald Boyd, Hamilton Lankford, Susanna Loeb and James Wyckoff

Journal of Educational and Behavioral Statistics, 2013, vol. 38, issue 6, 629-663

Abstract: Test-based accountability as well as value-added assessments and much experimental and quasi-experimental research in education rely on achievement tests to measure student skills and knowledge. Yet, we know little regarding fundamental properties of these tests, an important example being the extent of measurement error and its implications for educational policy and practice. While test vendors provide estimates of split-test reliability, these measures do not account for potentially important day-to-day differences in student performance. In this article, we demonstrate a credible, low-cost approach for estimating the overall extent of measurement error that can be applied when students take three or more tests in the subject of interest (e.g., state assessments in consecutive grades). Our method generalizes the test–retest framework by allowing for (a) growth or decay in knowledge and skills between tests, (b) tests being neither parallel nor vertically scaled, and (c) the degree of measurement error varying across tests. The approach maintains relatively unrestrictive, testable assumptions regarding the structure of student achievement growth. Estimation only requires descriptive statistics (e.g., test-score correlations). With student-level data, the extent and pattern of measurement-error heteroscedasticity can also be estimated. In turn, one can compute Bayesian posterior means of achievement and achievement gains given observed scores; these estimators have statistical properties superior to those of the observed score (or score gain). We employ math and English language arts test-score data from New York City to demonstrate these methods and estimate that the overall extent of test measurement error is at least twice as large as that reported by the test vendor.
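
The abstract's claim that test-score correlations from three tests can suffice is easiest to see in a textbook special case, sketched below in Python. This is not the authors' estimator. It assumes, beyond anything stated in the abstract, that true achievement follows a first-order (simplex) process and that measurement errors are independent of achievement and of each other; under those assumptions the reliability of the middle test equals r12 * r23 / r13. The function names, the simulated parameters, and the simple shrinkage formula (a normal-normal posterior mean with constant error variance, ignoring the heteroscedasticity the paper addresses) are all illustrative assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def middle_test_reliability(r12, r13, r23):
    """Reliability of test 2 from three observed-score correlations.

    Assumes (hypothetically) a first-order process for true scores, so that
    corr(true1, true3) = corr(true1, true2) * corr(true2, true3).
    """
    if r13 == 0:
        raise ValueError("r13 must be nonzero")
    return r12 * r23 / r13


def shrunken_score(score, mean, reliability):
    """Normal-normal posterior mean: pull the observed score toward the mean."""
    return mean + reliability * (score - mean)


def simulate_check(n=50_000, seed=0):
    """Simulate three noisy tests and compare the estimate with the truth.

    All parameters below are made up for illustration. The tests have
    different error variances, so they are deliberately not parallel.
    """
    rng = random.Random(seed)
    x1, x2, x3 = [], [], []
    for _ in range(n):
        t1 = rng.gauss(0.0, 1.0)             # true achievement, test 1
        t2 = 0.9 * t1 + rng.gauss(0.0, 0.4)  # growth depends only on t1
        t3 = 0.9 * t2 + rng.gauss(0.0, 0.4)  # growth depends only on t2
        x1.append(t1 + rng.gauss(0.0, 0.5))  # observed = true + error
        x2.append(t2 + rng.gauss(0.0, 0.6))
        x3.append(t3 + rng.gauss(0.0, 0.7))
    estimate = middle_test_reliability(
        pearson(x1, x2), pearson(x1, x3), pearson(x2, x3)
    )
    true_var = 0.9 ** 2 * 1.0 + 0.4 ** 2     # variance of t2
    truth = true_var / (true_var + 0.6 ** 2)  # var(true) / var(observed)
    return estimate, truth
```

Calling `simulate_check()` returns the correlation-based estimate alongside the reliability implied by the simulated parameters, so the two can be compared directly. Because only correlations enter the formula, rescaling any one test leaves the estimate unchanged, which is the sense in which tests need not share a common scale in this simplified setting.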

Keywords: generalizability theory; reliability; testing; high-stakes testing; correlational analysis; longitudinal studies; effect size
Date: 2013

Downloads: (external link)
https://journals.sagepub.com/doi/10.3102/1076998613508584 (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:sae:jedbes:v:38:y:2013:i:6:p:629-663

DOI: 10.3102/1076998613508584

Bibliographic data for series maintained by SAGE Publications.

 
Page updated 2025-03-19
Handle: RePEc:sae:jedbes:v:38:y:2013:i:6:p:629-663