A large-scale semi-automated approach for assessing document-type classification errors in bibliometric databases

D. A. Maisano, L. Mastrogiacomo, L. Ferrara and F. Franceschini
Additional contact information
D. A. Maisano: Politecnico Di Torino
L. Mastrogiacomo: Politecnico Di Torino
L. Ferrara: Politecnico Di Torino
F. Franceschini: Politecnico Di Torino

Scientometrics, 2025, vol. 130, issue 3, No 22, 1938 pages

Abstract: The accuracy of bibliometric databases in classifying document types (DTs), such as research articles, conference proceedings, reviews, short notes, letters, and book chapters, is crucial for the academic community, as bibliometric indicators may significantly influence research funding, decision-making, and academic reputation. This study presents a semi-automated methodology to assess the accuracy of DT classification in bibliometric databases, such as Scopus and Web of Science (WoS). The methodology can handle large document volumes and adapt to different DT categories without predefined correspondences. The first phase of the methodology automatically identifies discrepancies in DT classifications between Scopus and WoS, in order to find potentially misclassified documents; the second phase involves manually analyzing these documents to confirm and attribute classification errors. The methodology is applied to a sample of several tens of thousands of papers from the teaching staff of two major universities in Turin (Italy). The results show overall error rates of approximately 2.7% for Scopus and 2.3% for WoS. The paper also analyzes the most common types of errors found in both databases, providing an interpretation of these inaccuracies and some insights for possible improvements in the quality of these databases.
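
The first phase described in the abstract (automatic detection of Scopus/WoS discrepancies without a predefined DT mapping) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name flag_discrepancies, the min_share threshold, the (doc_id, scopus_dt, wos_dt) record layout, and the toy records are all assumptions made for illustration. The sketch infers, from the data alone, the WoS DT that most often accompanies each Scopus DT and flags documents that fall outside that dominant pairing as candidates for the manual second phase.

```python
from collections import Counter, defaultdict


def flag_discrepancies(records, min_share=0.5):
    """Return ids of documents whose DT pair deviates from the dominant pairing.

    records: iterable of (doc_id, scopus_dt, wos_dt) tuples (assumed layout).
    min_share: hypothetical threshold; a Scopus DT gets a counterpart only if
    its most frequent WoS DT covers at least this share of its documents.
    """
    records = list(records)

    # Contingency counts: how often each Scopus DT co-occurs with each WoS DT.
    pair_counts = defaultdict(Counter)
    for _, scopus_dt, wos_dt in records:
        pair_counts[scopus_dt][wos_dt] += 1

    # Infer the dominant WoS counterpart of each Scopus DT from the data itself.
    dominant = {}
    for scopus_dt, counts in pair_counts.items():
        wos_dt, n = counts.most_common(1)[0]
        if n / sum(counts.values()) >= min_share:
            dominant[scopus_dt] = wos_dt

    # Documents off the dominant pairing (or whose Scopus DT has no clear
    # counterpart) are flagged for manual review.
    return [doc_id for doc_id, s, w in records if dominant.get(s) != w]


# Toy data, invented for illustration only.
records = [
    ("d1", "Article", "Article"),
    ("d2", "Article", "Article"),
    ("d3", "Article", "Review"),
    ("d4", "Review", "Review"),
    ("d5", "Review", "Review"),
    ("d6", "Conference Paper", "Proceedings Paper"),
    ("d7", "Conference Paper", "Proceedings Paper"),
    ("d8", "Conference Paper", "Article"),
]

print(flag_discrepancies(records))  # ['d3', 'd8']
```

A flagged document is only a candidate: deciding whether Scopus, WoS, or neither is wrong still requires the manual inspection that the abstract assigns to the second phase.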

Keywords: Bibliometric database; Document-type classification; Semi-automated methodology; Database accuracy; Misclassification; Scopus; Web of Science
Date: 2025

Downloads: (external link)
http://link.springer.com/10.1007/s11192-025-05244-y Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:scient:v:130:y:2025:i:3:d:10.1007_s11192-025-05244-y

Ordering information: This journal article can be ordered from
http://www.springer.com/economics/journal/11192

DOI: 10.1007/s11192-025-05244-y

Scientometrics is currently edited by Wolfgang Glänzel

More articles in Scientometrics from Springer, Akadémiai Kiadó
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-04-11
Handle: RePEc:spr:scient:v:130:y:2025:i:3:d:10.1007_s11192-025-05244-y