A Second Dystopia in Education: Validity Issues in Authentic Assessment Practices
John D. Hathcoat,
Jeremy D. Penn,
Laura L. B. Barnes and
Johnathan C. Comer
Additional contact information
John D. Hathcoat: James Madison University
Jeremy D. Penn: North Dakota State University
Laura L. B. Barnes: Oklahoma State University
Johnathan C. Comer: Oklahoma State University
Research in Higher Education, 2016, vol. 57, issue 7, No 5, 892-912
Abstract:
Authentic assessments used in response to accountability demands in higher education face at least two threats to validity. First, a lack of interchangeability between assessment tasks introduces bias when aggregate-based scores are used at an institutional level. Second, reliance on written products to capture constructs such as critical thinking (CT) may introduce construct-irrelevant variance if score variance reflects written communication (WC) skill as well as variation in the construct of interest. Two studies investigated these threats to validity. Students' written responses to faculty in-class assignments were sampled from general education courses within a single institution. Faculty raters trained to use a common rubric then rated the students' written papers. The first study used hierarchical linear modeling to estimate the magnitude of between-assignment variance in CT scores among 343 student-written papers nested within 18 assignments. About 18% of the total CT variance was attributed to differences in average CT scores across assignments, indicating that assignments were not interchangeable. Approximately 47% of this between-assignment variance was predicted by the extent to which the assignments asked students to demonstrate their own perspective. Thus, aggregating CT scores across students and assignments could bias the scores up or down depending on the characteristics of the assignments, particularly perspective-taking. The second study used exploratory factor analysis and squared partial correlations to estimate the magnitude of construct-irrelevant variance in CT scores. Student papers were rated for CT by one group of faculty and for WC by a different group of faculty. Nearly 25% of the variance in CT scores was attributed to differences in WC scores. Score-based interpretations of CT may therefore need to be delimited if observations are obtained solely through written products. Both studies imply a need to gather additional validity evidence in authentic assessment practices before this strategy is widely adopted among institutions of higher education. The authors also address misconceptions about standardization in authentic assessment practices.
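A note on the reported variance proportions (a hedged sketch under standard multilevel-modeling conventions, not necessarily the authors' exact specification): if CT ratings follow a two-level random-intercept model with between-assignment variance \(\tau_{00}\) and within-assignment (paper-level) variance \(\sigma^2\), the share of total variance attributable to assignments is the intraclass correlation

\[ \rho = \frac{\tau_{00}}{\tau_{00} + \sigma^2} \approx 0.18, \]

and the 47% figure would, under the usual convention, correspond to the proportional reduction in \(\tau_{00}\) after adding the perspective-taking predictor:

\[ R^2_{\text{between}} = \frac{\tau_{00}^{\text{null}} - \tau_{00}^{\text{conditional}}}{\tau_{00}^{\text{null}}} \approx 0.47. \]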
Keywords: Critical thinking; Writing; Performance assessment; Authentic assessment; Validity; Standardization; Task-specificity; Higher education
Date: 2016
References: View complete reference list from CitEc
Citations: View citations in EconPapers (1)
Downloads (external link): http://link.springer.com/10.1007/s11162-016-9407-1 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Persistent link: https://EconPapers.repec.org/RePEc:spr:reihed:v:57:y:2016:i:7:d:10.1007_s11162-016-9407-1
Ordering information: This journal article can be ordered from http://www.springer.com/journal/11162
DOI: 10.1007/s11162-016-9407-1
Research in Higher Education is currently edited by Robert K. Toutkoushian
More articles in Research in Higher Education from Springer, Association for Institutional Research
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.