Comparing representations of a discipline derived through LDA vs. intellectual content analysis: the case of information science

Kaisa Ylikruuvi, Kalervo Järvelin, Pertti Vakkari and Martti Juhola
Additional contact information
Kaisa Ylikruuvi: Tampere University
Kalervo Järvelin: Tampere University
Pertti Vakkari: Tampere University
Martti Juhola: Tampere University

Scientometrics, 2025, vol. 130, issue 8, No 6, 4309-4337

Abstract: The paper looks at the methodology of empirical analyses of the content and structure of Information Science (IS). The traditional approach in empirical analysis is intellectual content analysis (ICA) of a representative data set. The high labor cost prohibits the analysis of massive data sets. A recent alternative is based on data mining/machine learning. Its strength is the capability of analyzing massive datasets efficiently. However, a significant issue is the quality of content analysis. The paper compares latent Dirichlet allocation/topic modeling (LDA/TM) based statistical analysis to ICA using the same data set, 1514 scholarly articles from the year 2015 volumes of 30 IS journals. The intellectual analysis provides the mirror for reflecting the TM results. LDA/TM is strong in identifying new directions of a discipline and processing masses of text. Its weaknesses include semantic haziness of topics due to bag-of-words article representation, text pre-processing, tuning of parameters, and being unanalytic in composing topics from words belonging to different categories.

Keywords: Information science; Content analysis; Latent Dirichlet allocation; Comparative study
Date: 2025

Downloads: (external link)
http://link.springer.com/10.1007/s11192-025-05376-1 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:scient:v:130:y:2025:i:8:d:10.1007_s11192-025-05376-1

Ordering information: This journal article can be ordered from
http://www.springer.com/economics/journal/11192

DOI: 10.1007/s11192-025-05376-1

Scientometrics is currently edited by Wolfgang Glänzel

More articles in Scientometrics from Springer, Akadémiai Kiadó
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-10-11
Handle: RePEc:spr:scient:v:130:y:2025:i:8:d:10.1007_s11192-025-05376-1