Enhancing scientific literature summarization via contrastive learning and chain-of-thought prompting
Yu Feng,
Wenkang An,
Hao Wang and
Zhen Yin
Additional contact information
Yu Feng: Beijing Renhe Information Technology Co., Ltd.
Wenkang An: Beijing Renhe Information Technology Co., Ltd.
Hao Wang: Beijing Renhe Information Technology Co., Ltd.
Zhen Yin: Beijing Renhe Information Technology Co., Ltd.
Scientometrics, 2025, vol. 130, issue 8, No 22, 4773-4799
Abstract:
The exponential growth of scientific literature presents a significant challenge for researchers to efficiently access and synthesize key information. Automatic summarization techniques have become essential for addressing this issue, enabling researchers to quickly grasp core content and key findings. However, the complexity and domain-specific nature of scientific texts demand high accuracy and contextual depth, which remain challenging for existing summarization models. This paper introduces a hierarchical summarization framework that integrates contrastive learning, document section classification, customized prompt-based summarization, and Chain-of-Thought (CoT) structured reasoning. Our approach first utilizes contrastive learning to enhance section classification, ensuring accurate content segmentation. Based on this classification, section-specific prompts are designed to generate targeted summaries, which are subsequently refined and aggregated through a CoT-based reasoning process to improve coherence and informativeness. We evaluate our method on the Sci-Summary dataset, comprising 20,000 scientific articles across multiple disciplines and languages. Experimental results demonstrate that our approach outperforms state-of-the-art baseline models, achieving notable improvements in ROUGE scores, BERTScore, and GPT-4o-based evaluation (G-Eval). Furthermore, the results highlight the framework’s ability to preserve factual accuracy, enhance coherence, and improve the interpretability of generated summaries. These findings underscore the potential of our method in advancing scientific literature summarization, offering a scalable and effective solution for automated knowledge extraction in research domains.
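The pipeline described in the abstract (section classification, then section-specific prompting, then CoT-based aggregation) can be sketched roughly as below. All function names, labels, and prompt templates here are illustrative assumptions, not the authors' implementation; in particular, the keyword-based classifier merely stands in for the paper's contrastive-learning classifier, and the prompt strings are placeholders.

```python
# Illustrative sketch of the hierarchical summarization pipeline.
# Labels, prompts, and helpers are hypothetical stand-ins, not the
# authors' actual code.

SECTION_LABELS = ["introduction", "methods", "results", "conclusion"]

# Hypothetical section-specific prompt templates.
PROMPTS = {
    "introduction": "Summarize the research problem and motivation:\n{text}",
    "methods": "Summarize the methodology and key techniques:\n{text}",
    "results": "Summarize the main findings and metrics:\n{text}",
    "conclusion": "Summarize the contributions and implications:\n{text}",
}

def classify_section(text: str) -> str:
    """Stand-in for the contrastive-learning section classifier;
    here a trivial keyword heuristic for illustration only."""
    lowered = text.lower()
    for label in SECTION_LABELS:
        if label in lowered:
            return label
    return "introduction"  # default fallback

def build_section_prompt(text: str) -> str:
    """Select the section-specific prompt from the predicted label."""
    label = classify_section(text)
    return PROMPTS[label].format(text=text)

def build_cot_aggregation_prompt(section_summaries: list[str]) -> str:
    """Chain-of-thought style aggregation: ask the model to reason
    step by step before emitting the final unified summary."""
    joined = "\n".join(f"- {s}" for s in section_summaries)
    return (
        "Given these section summaries:\n"
        f"{joined}\n"
        "Think step by step: check factual consistency across sections, "
        "order the points logically, then write one coherent summary "
        "of the article."
    )
```

In a full system, each built prompt would be sent to a large language model; the sketch stops at prompt construction, which is the part the abstract specifies.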
Keywords: Automatic text summarization; Large language models; Contrastive learning; Chain-of-thought prompting
Date: 2025
Downloads: http://link.springer.com/10.1007/s11192-025-05397-w (abstract, text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:scient:v:130:y:2025:i:8:d:10.1007_s11192-025-05397-w
Ordering information: This journal article can be ordered from
http://www.springer.com/economics/journal/11192
DOI: 10.1007/s11192-025-05397-w
Scientometrics is currently edited by Wolfgang Glänzel
More articles in Scientometrics from Springer, Akadémiai Kiadó
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.