CiteFusion: an ensemble framework for citation intent classification harnessing dual-model binary couples and SHAP analyses

Lorenzo Paolini, Sahar Vahdati, Angelo Di Iorio, Robert Wardenga, Ivan Heibi and Silvio Peroni
Additional contact information
Lorenzo Paolini: University of Bologna, Department of Computer Science and Engineering
Sahar Vahdati: Technical University of Dresden, Nature-Inspired Machine Intelligence Group, SCaDS.AI Center
Angelo Di Iorio: University of Bologna, Department of Computer Science and Engineering
Robert Wardenga: Institute for Applied Computer Science, InfAI
Ivan Heibi: University of Bologna, Department of Classical Philology and Italian Studies, Research Centre for Open Scholarly Metadata
Silvio Peroni: University of Bologna, Department of Classical Philology and Italian Studies, Research Centre for Open Scholarly Metadata

Scientometrics, 2025, vol. 130, issue 11, No 3, 5981 pages

Abstract: Understanding the motivations underlying scholarly citations is essential to evaluate research impact and promote transparent scholarly communication. This study introduces CiteFusion, an ensemble framework designed to address the multi-class Citation Intent Classification task on two benchmark datasets: SciCite and ACL-ARC. The framework employs a one-vs-all decomposition of the multi-class task into class-specific binary subtasks, leveraging, for each citation intent, a complementary pair of independently tuned SciBERT and XLNet models. The outputs of these base models are aggregated through a feedforward neural network meta-classifier to reconstruct the original classification task. To enhance interpretability, SHAP (SHapley Additive exPlanations) is employed to analyze token-level contributions and interactions among base models, providing transparency into the classification dynamics of CiteFusion and insight into the kinds of misclassifications the ensemble makes. In addition, this work investigates the semantic role of structural context by incorporating section titles into input sentences as framing devices and assessing their positive impact on classification accuracy. CiteFusion demonstrates robust performance in imbalanced and data-scarce scenarios: experimental results show that it achieves state-of-the-art performance, with Macro-F1 scores of 89.60% on SciCite and 76.24% on ACL-ARC. Furthermore, to ensure interoperability and reusability, citation intents from both datasets' schemas are mapped to Citation Typing Ontology (CiTO) object properties, highlighting some overlaps. Finally, we describe and release a web-based application that classifies citation intents using the CiteFusion models developed on SciCite.
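
The pipeline outlined in the abstract (one-vs-all binary couples whose outputs feed a feedforward meta-classifier) can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The keyword scorers stand in for the fine-tuned SciBERT and XLNet models, the cue-word lists and network weights are hand-set and hypothetical, and the way the section title is prepended to the sentence is an assumed input format. Only the three SciCite labels are used.

```python
import math

# SciCite's three citation-intent labels.
CLASSES = ["background", "method", "result"]

# Hypothetical cue-word lists standing in for the two fine-tuned models of
# each binary couple. They are illustrative only and not taken from the paper.
CUES_A = {
    "background": {"previously", "known", "shown"},
    "method": {"use", "using", "algorithm"},
    "result": {"consistent", "outperforms", "results"},
}
CUES_B = {
    "background": {"introduction", "studies", "reported"},
    "method": {"methods", "apply", "approach"},
    "result": {"results", "compared", "agree"},
}


def binary_score(text, cues):
    """Toy binary base model: maps the cue-word hit count to a score in [0, 1)."""
    tokens = [tok.strip(".,;:()") for tok in text.lower().split()]
    hits = sum(tok in cues for tok in tokens)
    return 1.0 - math.exp(-hits)


def base_features(text):
    """One-vs-all step: two binary scores per class, concatenated in class order."""
    feats = []
    for label in CLASSES:
        feats.append(binary_score(text, CUES_A[label]))
        feats.append(binary_score(text, CUES_B[label]))
    return feats


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hand-set weights (an assumption; a real meta-classifier would learn them).
# Hidden unit k averages the two scores of the couple for class k.
W1 = [[0.5 if col // 2 == row else 0.0 for col in range(6)] for row in range(3)]
B1 = [0.0, 0.0, 0.0]
W2 = [[4.0 if col == row else 0.0 for col in range(3)] for row in range(3)]
B2 = [0.0, 0.0, 0.0]


def meta_classifier(feats):
    """Feedforward pass: one ReLU hidden layer followed by a softmax output."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, feats)) + b)
              for row, b in zip(W1, B1)]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(W2, B2)]
    return softmax(logits)


def predict(sentence, section_title=None):
    """Return (label, probabilities) for one citation sentence."""
    # Prepending the title as "<title>. <sentence>" is an assumed input format.
    text = f"{section_title}. {sentence}" if section_title else sentence
    probs = meta_classifier(base_features(text))
    return CLASSES[probs.index(max(probs))], probs


label, probs = predict("We use the algorithm of Smith et al.", "Methods")
print(label, [round(p, 3) for p in probs])
```

Replacing `binary_score` with the positive-class probabilities of actual fine-tuned models would leave the rest of the sketch unchanged: the meta-classifier only sees a fixed-length vector with two scores per citation intent.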

Keywords: Citation Intent Classification; Language Models; Ensemble Strategies; Explainable AI
Date: 2025

Downloads: (external link)
http://link.springer.com/10.1007/s11192-025-05418-8 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:scient:v:130:y:2025:i:11:d:10.1007_s11192-025-05418-8

Ordering information: This journal article can be ordered from
http://www.springer.com/economics/journal/11192

DOI: 10.1007/s11192-025-05418-8

Scientometrics is currently edited by Wolfgang Glänzel

More articles in Scientometrics from Springer, Akadémiai Kiadó
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-12-05
Handle: RePEc:spr:scient:v:130:y:2025:i:11:d:10.1007_s11192-025-05418-8