The Effect of Article Characteristics on Citation Number in a Diachronic Dataset of the Biomedical Literature on Chronic Inflammation: An Analysis by Ensemble Machines
Carlo Galli and Stefano Guizzardi
Additional contact information
Carlo Galli: Department of Medicine and Surgery, Histology and Embryology Lab, University of Parma, Via Volturno 39, 43126 Parma, Italy
Stefano Guizzardi: Department of Medicine and Surgery, Histology and Embryology Lab, University of Parma, Via Volturno 39, 43126 Parma, Italy
Publications, 2021, vol. 9, issue 2, 1-11
Abstract:
Citations are core metrics to gauge the relevance of scientific literature. Identifying features that can predict a high citation count is therefore of primary importance. For the present study, we generated a dataset of 121,640 publications on chronic inflammation from the Scopus database, spanning 1951 to 2021 and containing data such as titles, authors, journal, publication date, type of document, type of access and citation count. We then computed title length, author count, title sentiment score, and the number of colons, semicolons and question marks in the title, and used these data as predictors in Gradient Boosting, Bagging and Random Forest regressors and classifiers. We were able to train these machines on the data, and Gradient Boosting achieved an F1 score of 0.552 on classification. The models agreed that document type, access type and number of authors were the best predictors, followed by title length.
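As an illustration of the title-derived predictors named in the abstract, the following is a minimal sketch using only the Python standard library. It is not the authors' code. The function name `extract_features`, the class `TitleFeatures`, the choice to measure title length in characters, and the semicolon-separated author string are all assumptions made for the example; the sample record is invented. The sentiment score is left out because the abstract does not say how it was computed.

```python
from dataclasses import dataclass


@dataclass
class TitleFeatures:
    title_length: int    # assumption: length measured in characters, not words
    author_count: int
    colons: int
    semicolons: int
    question_marks: int


def extract_features(title: str, authors: str, author_sep: str = ";") -> TitleFeatures:
    """Compute simple predictors for one bibliographic record.

    `authors` is assumed to be one string with names separated by
    `author_sep`; the actual export format may differ.
    """
    # Drop empty fragments so a trailing separator does not inflate the count.
    names = [name.strip() for name in authors.split(author_sep) if name.strip()]
    return TitleFeatures(
        title_length=len(title),
        author_count=len(names),
        colons=title.count(":"),
        semicolons=title.count(";"),
        question_marks=title.count("?"),
    )


# Invented record, for illustration only.
example = extract_features(
    "Chronic inflammation: friend or foe?",
    "Rossi A.; Bianchi B.; Verdi C.",
)
print(example)
```

Features of this kind, together with categorical fields such as document type and access type, could then be passed as a numeric matrix to an ensemble regressor or classifier.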
Keywords: citations; title; machine learning
JEL-codes: A2 D83 L82
Date: 2021
Downloads:
https://www.mdpi.com/2304-6775/9/2/15/pdf (application/pdf)
https://www.mdpi.com/2304-6775/9/2/15/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jpubli:v:9:y:2021:i:2:p:15-:d:530934
Publications is currently edited by Ms. Jennifer Zhang
Bibliographic data for series maintained by MDPI Indexing Manager.