Are the confidence scores of reviewers consistent with the review content? Evidence from top conference proceedings in AI
Wenqing Wu (Nanjing University of Science and Technology),
Haixu Xi (Nanjing University of Science and Technology) and
Chengzhi Zhang (Nanjing University of Science and Technology)
Scientometrics, 2024, vol. 129, issue 7, No 17, 4109-4135
Abstract:
Peer review is a critical process used in academia to assess the quality and validity of research articles. Top-tier conferences in the field of artificial intelligence (e.g., ICLR and ACL) require reviewers to provide confidence scores to ensure the reliability of their review reports. However, existing studies on confidence scores have not measured the consistency between the review text and the confidence score at a fine-grained level, which may overlook finer details in the text (such as aspects) and lead to an incomplete and insufficiently objective analysis of the results. In this work, we propose assessing the consistency between the textual content of review reports and the assigned scores at a fine-grained level, including the word, sentence and aspect levels. The data used in this paper are derived from the peer review comments of conferences in the fields of deep learning and natural language processing. We employed deep learning models to detect hedge sentences and their corresponding aspects. Furthermore, we conducted statistical analyses of the length of review reports, the frequency of hedge word usage, the number of hedge sentences, the frequency of aspect mentions, and their associated sentiment to assess the consistency between the textual content and confidence scores. Finally, we performed correlation analysis, significance tests and regression analysis to examine the impact of confidence scores on paper outcomes. The results indicate that the textual content of the review reports and their confidence scores are highly consistent at the word, sentence and aspect levels. The regression results reveal a negative correlation between confidence scores and paper outcomes: higher confidence scores given by reviewers were associated with rejection. This suggests that experts' overall assessments of a paper's content and quality are reliable, supporting the transparency and fairness of the peer review process. We release our data and associated code at https://github.com/njust-winchy/confidence_score.
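The abstract mentions correlation analysis, significance tests and regression linking confidence scores to paper outcomes. As a minimal illustrative sketch only (this is not the authors' released code; the variable names and toy data below are hypothetical, and numpy, scipy and statsmodels are assumed to be available), the core of such an analysis could look like this in Python:

    # Minimal sketch (not the authors' released code) of the analyses the
    # abstract describes: a correlation/significance test and a regression
    # of paper outcome on reviewer confidence. All data here are toy values.
    import numpy as np
    from scipy import stats
    import statsmodels.api as sm

    # Hypothetical records: reviewer confidence (1-5) and decision (1 = accept).
    confidence = np.array([2, 3, 4, 5, 3, 4, 5, 2, 3, 5], dtype=float)
    accepted = np.array([1, 1, 0, 0, 1, 1, 0, 1, 0, 0])

    # Correlation analysis with a significance test.
    rho, p = stats.spearmanr(confidence, accepted)
    print(f"Spearman rho = {rho:.3f}, p = {p:.3f}")

    # Logistic regression of decision on confidence: a negative coefficient
    # on confidence would mirror the reported association between higher
    # confidence scores and rejection.
    result = sm.Logit(accepted, sm.add_constant(confidence)).fit(disp=0)
    print(result.params)  # [intercept, confidence coefficient]

The authors' actual data and code are available at the GitHub link given in the abstract.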
Keywords: Peer review; Confidence score; Paper decision; Hedge sentences detection; Consistency analysis
Date: 2024
Downloads: http://link.springer.com/10.1007/s11192-024-05070-8 (abstract, text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:scient:v:129:y:2024:i:7:d:10.1007_s11192-024-05070-8
Ordering information: This journal article can be ordered from http://www.springer.com/economics/journal/11192
DOI: 10.1007/s11192-024-05070-8
Scientometrics is currently edited by Wolfgang Glänzel and published by Springer and Akadémiai Kiadó.