Classifying ultra‐short scientific texts using a hybrid hierarchical multi‐label classification framework
Dengsheng Wu, Huidong Wu, Fan Meng and Jianping Li
Journal of the Association for Information Science & Technology, 2025, vol. 76, issue 12, 1625-1646
Abstract:
Scientific text classification is essential for efficiently organizing and assimilating scientific knowledge. However, existing methods struggle to classify ultra‐short scientific texts due to their limited content and complex hierarchical labeling. To overcome these challenges, we introduce the BERT‐HMCN framework, which combines Bidirectional Encoder Representations from Transformers (BERT) with a Hierarchical Multi‐label Classification Network (HMCN). This framework introduces a novel level‐fixed fine‐tuning strategy that strengthens the connection between text semantics and hierarchical labels, enhancing the representation of ultra‐short texts. We evaluated BERT‐HMCN's performance on a dataset of 75,065 program titles from the National Natural Science Foundation of China. Our results show that BERT‐HMCN outperforms existing models in both overall performance and hierarchical accuracy. We also conducted a comparative analysis with autoregressive large language models (LLMs), illustrating the strengths of each in different contexts. Further analysis confirms the effectiveness and robustness of the BERT‐HMCN framework. We discuss its theoretical contributions and practical applications, underscoring the broader implications of these results in scientific text classification and other related fields.
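The abstract does not spell out how hierarchical labels are scored or decoded, so the following is only a minimal sketch of what hierarchical consistency means in a multi-label setting. It illustrates a generic post-processing rule (keep a label only if every ancestor is also predicted), not the BERT-HMCN method itself. The label codes, scores, threshold, and the helper name `consistent_labels` are all hypothetical and are not taken from the paper or the NSFC dataset.

```python
# Illustrative only: label codes, scores, and the threshold are hypothetical,
# not taken from the paper or the NSFC dataset.

def consistent_labels(scores, parent, threshold=0.5):
    """Return labels whose own score and all ancestors' scores reach the threshold."""
    def passes(label):
        # Walk up the hierarchy; the label and every ancestor must pass.
        while label is not None:
            if scores.get(label, 0.0) < threshold:
                return False
            label = parent.get(label)
        return True

    return sorted(label for label in scores if passes(label))


# Hypothetical two-level hierarchy: top-level codes "A" and "B" with children.
parent = {"A": None, "B": None, "A01": "A", "A02": "A", "B01": "B"}

# Hypothetical per-label probabilities for one short title.
scores = {"A": 0.9, "A01": 0.8, "A02": 0.3, "B": 0.2, "B01": 0.7}

predicted = consistent_labels(scores, parent)
print(predicted)
```

In this toy input, `B01` scores above the threshold but is dropped because its parent `B` does not, which is the kind of hierarchy violation that a hierarchical accuracy measure would penalize.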
Date: 2025
Downloads: https://doi.org/10.1002/asi.70018
Persistent link: https://EconPapers.repec.org/RePEc:bla:jinfst:v:76:y:2025:i:12:p:1625-1646