Enhancing scientific literature summarization via contrastive learning and chain-of-thought prompting
Yu Feng,
Wenkang An,
Hao Wang and
Zhen Yin
Additional contact information
Yu Feng: Beijing Renhe Information Technology Co., Ltd.
Wenkang An: Beijing Renhe Information Technology Co., Ltd.
Hao Wang: Beijing Renhe Information Technology Co., Ltd.
Zhen Yin: Beijing Renhe Information Technology Co., Ltd.
Scientometrics, 2025, vol. 130, issue 8, No 22, 4773-4799
Abstract:
The exponential growth of scientific literature presents a significant challenge for researchers to efficiently access and synthesize key information. Automatic summarization techniques have become essential for addressing this issue, enabling researchers to quickly grasp core content and key findings. However, the complexity and domain-specific nature of scientific texts demand high accuracy and contextual depth, which remain challenging for existing summarization models. This paper introduces a hierarchical summarization framework that integrates contrastive learning, document section classification, customized prompt-based summarization, and Chain-of-Thought (CoT) structured reasoning. Our approach first utilizes contrastive learning to enhance section classification, ensuring accurate content segmentation. Based on this classification, section-specific prompts are designed to generate targeted summaries, which are subsequently refined and aggregated through a CoT-based reasoning process to improve coherence and informativeness. We evaluate our method on the Sci-Summary dataset, comprising 20,000 scientific articles across multiple disciplines and languages. Experimental results demonstrate that our approach outperforms state-of-the-art baseline models, achieving notable improvements in ROUGE scores, BERTScore, and GPT-4o-based evaluation (G-Eval). Furthermore, the results highlight the framework’s ability to preserve factual accuracy, enhance coherence, and improve the interpretability of generated summaries. These findings underscore the potential of our method in advancing scientific literature summarization, offering a scalable and effective solution for automated knowledge extraction in research domains.
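The pipeline described in the abstract (section classification, then section-specific prompting, then CoT-based aggregation) can be sketched roughly as below. All function names, labels, and prompt templates here are illustrative assumptions, not the authors' implementation; in particular, the keyword-based classifier merely stands in for the paper's contrastive-learning classifier, and the prompt strings are placeholders.

```python
# Illustrative sketch of the hierarchical summarization pipeline.
# Labels, prompts, and helpers are hypothetical stand-ins, not the
# authors' actual code.

SECTION_LABELS = ["introduction", "methods", "results", "conclusion"]

# Hypothetical section-specific prompt templates.
PROMPTS = {
    "introduction": "Summarize the research problem and motivation:\n{text}",
    "methods": "Summarize the methodology and key techniques:\n{text}",
    "results": "Summarize the main findings and metrics:\n{text}",
    "conclusion": "Summarize the contributions and implications:\n{text}",
}

def classify_section(text: str) -> str:
    """Stand-in for the contrastive-learning section classifier;
    here a trivial keyword heuristic for illustration only."""
    lowered = text.lower()
    for label in SECTION_LABELS:
        if label in lowered:
            return label
    return "introduction"  # default fallback

def build_section_prompt(text: str) -> str:
    """Select the section-specific prompt from the predicted label."""
    label = classify_section(text)
    return PROMPTS[label].format(text=text)

def build_cot_aggregation_prompt(section_summaries: list[str]) -> str:
    """Chain-of-thought style aggregation: ask the model to reason
    step by step before emitting the final unified summary."""
    joined = "\n".join(f"- {s}" for s in section_summaries)
    return (
        "Given these section summaries:\n"
        f"{joined}\n"
        "Think step by step: check factual consistency across sections, "
        "order the points logically, then write one coherent summary "
        "of the article."
    )
```

In a full system, each built prompt would be sent to a large language model; the sketch stops at prompt construction, which is the part the abstract specifies.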
Keywords: Automatic text summarization; Large language models; Contrastive learning; Chain-of-thought prompting
Date: 2025
Downloads: http://link.springer.com/10.1007/s11192-025-05397-w (abstract, text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:scient:v:130:y:2025:i:8:d:10.1007_s11192-025-05397-w
Ordering information: This journal article can be ordered from
http://www.springer.com/economics/journal/11192
DOI: 10.1007/s11192-025-05397-w
Scientometrics is currently edited by Wolfgang Glänzel
More articles in Scientometrics from Springer, Akadémiai Kiadó
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.