Beyond Quality: Predicting Citation Impact in Business Research Using Data Science

Reyner Pérez-Campdesuñer, Alexander Sánchez-Rodríguez, Rodobaldo Martínez-Vivar, Margarita De Miguel-Guzmán and Gelmar García-Vidal
Additional contact information
Reyner Pérez-Campdesuñer: Faculty of Law, Administrative and Social Sciences, Universidad UTE, Quito 170527, Ecuador
Alexander Sánchez-Rodríguez: Faculty of Engineering Sciences and Industries, Universidad UTE, Quito 170527, Ecuador
Rodobaldo Martínez-Vivar: Faculty of Law, Administrative and Social Sciences, Universidad UTE, Quito 170527, Ecuador
Margarita De Miguel-Guzmán: Department of Administration, Instituto Superior Tecnológico Atlantic, Santo Domingo 230201, Ecuador
Gelmar García-Vidal: Faculty of Law, Administrative and Social Sciences, Universidad UTE, Quito 170527, Ecuador

Publications, 2025, vol. 13, issue 3, 1-18

Abstract: The volume of scientific publications has increased exponentially over the past decades across virtually all academic disciplines. In this landscape of information overload, objective criteria are needed to identify high-impact research. Citation counts have traditionally served as a primary indicator of scientific relevance; however, questions remain as to whether they truly reflect the intrinsic quality of a publication. This study investigates the relationship between citation frequency and a wide range of editorial, authorship, and contextual variables. A dataset of 339,609 articles indexed in Scopus was analyzed, retrieved using the search query TITLE-ABS-KEY (management) AND LIMIT-TO (subarea, “Busi”). The research employed a descriptive analysis followed by two predictive modeling approaches: a Random Forest algorithm to assess variable importance, and a binary logistic regression to estimate the probability of a paper being cited. Results indicate that factors such as journal quartile, country of affiliation, number of authors, open access availability, and keyword usage significantly influence citation outcomes. The Random Forest model explained 94.9% of the variance, while the logistic model achieved an AUC of 0.669, allowing the formulation of a predictive citation equation. Findings suggest that multiple determinants beyond content quality drive citation behavior, and that citation probability can be predicted with reasonable accuracy, though inherent model limitations must be acknowledged.
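The abstract mentions a binary logistic regression that yields a predictive citation equation, evaluated by AUC. The following is a minimal sketch of what such an equation and its AUC check could look like. The feature names, coefficients, intercept, and sample papers are all hypothetical, chosen only for illustration; they are not the estimates or data reported in the article.

```python
import math

def citation_probability(features, coefficients, intercept):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(coefficients[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

def auc(labels, scores):
    """AUC as the share of (cited, uncited) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one cited and one uncited paper")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical coefficients, for illustration only (not the paper's estimates).
COEFFS = {"open_access": 0.4, "n_authors": 0.1, "top_quartile": 0.8}
INTERCEPT = -1.0

# Hypothetical papers: (features, 1 if cited else 0).
papers = [
    ({"open_access": 1, "n_authors": 4, "top_quartile": 1}, 1),
    ({"open_access": 0, "n_authors": 1, "top_quartile": 0}, 0),
    ({"open_access": 1, "n_authors": 2, "top_quartile": 0}, 1),
    ({"open_access": 0, "n_authors": 3, "top_quartile": 1}, 0),
]
labels = [y for _, y in papers]
scores = [citation_probability(x, COEFFS, INTERCEPT) for x, _ in papers]
print(round(auc(labels, scores), 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported value of 0.669 should be read.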

Keywords: citations; random forest models; bibliometric studies; business and management research
JEL-codes: A2 D83 L82
Date: 2025

Downloads: (external link)
https://www.mdpi.com/2304-6775/13/3/42/pdf (application/pdf)
https://www.mdpi.com/2304-6775/13/3/42/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jpubli:v:13:y:2025:i:3:p:42-:d:1742694

Publications is currently edited by Ms. Jennifer Zhang

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-10-04
Handle: RePEc:gam:jpubli:v:13:y:2025:i:3:p:42-:d:1742694