Inter-rater reliability and validity of peer reviews in an interdisciplinary field

Jens Jirschitzka, Aileen Oeberst, Richard Göllner and Ulrike Cress
Additional contact information
Jens Jirschitzka: Eberhard Karls Universität Tübingen
Aileen Oeberst: Johannes Gutenberg-Universität Mainz
Richard Göllner: Eberhard Karls Universität Tübingen
Ulrike Cress: Eberhard Karls Universität Tübingen

Scientometrics, 2017, vol. 113, issue 2, No 20, 1059-1092

Abstract: Peer review is an integral part of science. Devised to ensure and enhance the quality of scientific work, it is a crucial step that influences the publication of papers, the awarding of grants and, as a consequence, the careers of scientists. Meeting the challenges of this responsibility seems to require a certain shared understanding of scientific quality. Yet previous studies have shown that inter-rater reliability in peer review is relatively low. However, most of these studies did not take the ill-structured measurement design of the data into account. Moreover, no prior quantitative study has analyzed inter-rater reliability in an interdisciplinary field, and issues of validity have hardly ever been addressed. The three major research goals of this paper are therefore (1) to analyze inter-rater agreement on different rating dimensions (e.g., relevance and soundness) in an interdisciplinary field, (2) to account for ill-structured designs by applying state-of-the-art methods, and (3) to examine the construct and criterion validity of reviewers’ evaluations. A total of 443 reviews were analyzed, provided by m = 130 reviewers for n = 145 submissions to an interdisciplinary conference. Our findings demonstrate the urgent need for improvement of scientific peer review. Inter-rater reliability was rather poor, and evaluations by reviewers from the same scientific discipline as the submission did not differ significantly from evaluations by reviewers from other disciplines. These findings extend beyond those of prior research. Furthermore, the convergent and discriminant construct validity of the rating dimensions was low as well; nevertheless, a multidimensional model yielded a better fit than a unidimensional model. Our study also shows that the citation rate of accepted papers was positively associated with the relevance ratings made by same-discipline reviewers, whereas high novelty ratings from same-discipline reviewers were negatively associated with citation rate.
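To make the methodological point concrete: in an ill-structured design, each submission is rated by a different, partially overlapping subset of reviewers, so agreement coefficients that assume a fully crossed raters-by-targets design do not apply directly. The sketch below is a minimal illustration of one standard alternative, not the authors' actual method: it fits a one-way random-effects model to long-format review data and derives a single-rating reliability (ICC(1)) from the variance components. All column names and data values are hypothetical.

import pandas as pd
import statsmodels.formula.api as smf

# Long format: one row per review. Submissions receive different numbers
# of reviews from partially overlapping reviewer sets (ill-structured).
# These data are invented purely for illustration.
reviews = pd.DataFrame({
    "submission": ["s1", "s1", "s1", "s2", "s2", "s3", "s3", "s3", "s4", "s4"],
    "reviewer":   ["r1", "r2", "r3", "r1", "r4", "r2", "r4", "r5", "r3", "r5"],
    "score":      [4.0, 3.5, 4.5, 2.0, 2.5, 5.0, 4.0, 4.5, 3.0, 2.0],
})

# One-way random-effects model: a random intercept per submission;
# disagreement among reviewers of the same submission falls into the residual.
fit = smf.mixedlm("score ~ 1", reviews, groups=reviews["submission"]).fit(reml=True)

var_between = float(fit.cov_re.iloc[0, 0])  # variance between submissions
var_within = fit.scale                      # variance within submissions

# ICC(1): expected correlation between two single ratings of the same
# submission; values near zero indicate poor inter-rater reliability.
icc1 = var_between / (var_between + var_within)
print(f"single-rating reliability ICC(1) = {icc1:.2f}")

The paper's actual analyses are more elaborate (e.g., separate rating dimensions and same- versus other-discipline reviewers), but the underlying variance-decomposition logic is of this kind.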

Keywords: Peer review; Inter-rater reliability; Construct validity; Criterion validity; Interdisciplinary research; Citation rate
Date: 2017
Citations: 5 (in EconPapers)

Downloads: http://link.springer.com/10.1007/s11192-017-2516-6 (abstract, text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:scient:v:113:y:2017:i:2:d:10.1007_s11192-017-2516-6

Ordering information: This journal article can be ordered from
http://www.springer.com/economics/journal/11192

DOI: 10.1007/s11192-017-2516-6

Scientometrics is currently edited by Wolfgang Glänzel

Handle: RePEc:spr:scient:v:113:y:2017:i:2:d:10.1007_s11192-017-2516-6