Adaptive Chain-of-Thought Distillation Based on LLM Performance on Original Problems
Jianan Shen,
Xiaolong Cui,
Zhiqiang Gao and
Xuanzhu Sheng
Additional contact information
Jianan Shen: School of Information Engineering, Chinese People’s Armed Police Force Engineering University, Xi’an 710086, China
Xiaolong Cui: School of Information Engineering, Chinese People’s Armed Police Force Engineering University, Xi’an 710086, China
Zhiqiang Gao: School of Information Engineering, Chinese People’s Armed Police Force Engineering University, Xi’an 710086, China
Xuanzhu Sheng: School of Information Engineering, Chinese People’s Armed Police Force Engineering University, Xi’an 710086, China
Mathematics, 2025, vol. 13, issue 22, 1-19
Abstract:
The chain-of-thought (CoT) approach in large language models (LLMs) has markedly enhanced their performance on complex tasks; however, effectively distilling this capability into LLMs with smaller parameter scales remains a challenge. Studies have found that small LLMs do not always benefit from CoT distillation. Inspired by the concept of teaching students in accordance with their aptitude, we propose an adaptive chain-of-thought distillation (ACoTD) framework. The core idea is to dynamically and adaptively customize distillation data and supervision signals for student models based on their performance on the original problems. Specifically, ACoTD initially evaluates and categorizes the original problems according to the capabilities of the student model. Subsequently, for Easy- and Medium-level problems, short CoT distillation is employed as a brief lecture to reinforce knowledge and improve training efficiency, while for high-difficulty problems where the student model underperforms, detailed long CoT distillation is utilized for in-depth explanation to instill richer reasoning logic. This differentiated distillation strategy helps student models grasp the material more effectively. We conducted experiments on multiple benchmark datasets. The results indicate that, compared to the baseline, our method significantly improves the inference performance of small LLMs. Our method provides a new student-centered paradigm for knowledge distillation, demonstrating that adaptively adjusting teaching strategies based on student feedback is an effective way to enhance small LLMs' reasoning ability.
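The routing step the abstract describes — evaluate the student on each original problem, bin problems by difficulty, then pair Easy/Medium items with short-CoT targets and Hard items with long-CoT targets — can be sketched as follows. This is a minimal illustration, not the paper's implementation; the class names, accuracy thresholds, and helper functions are all assumptions.

```python
# Hypothetical sketch of the ACoTD data-routing step: problems are binned by
# the student's measured accuracy, and each problem is paired with either the
# short or the long teacher rationale. Thresholds (0.8, 0.4) are illustrative.

from dataclasses import dataclass

@dataclass
class Problem:
    question: str
    short_cot: str   # brief teacher rationale (short CoT)
    long_cot: str    # detailed teacher rationale (long CoT)

def categorize(student_accuracy: float) -> str:
    """Bin a problem by the student model's pass rate on it."""
    if student_accuracy >= 0.8:
        return "Easy"
    if student_accuracy >= 0.4:
        return "Medium"
    return "Hard"

def build_distillation_set(problems, student_accuracies):
    """Pair each problem with the CoT style matching its difficulty level."""
    data = []
    for prob, acc in zip(problems, student_accuracies):
        level = categorize(acc)
        # Easy/Medium -> short CoT (efficiency); Hard -> long CoT (depth).
        cot = prob.long_cot if level == "Hard" else prob.short_cot
        data.append({"question": prob.question, "target": cot, "level": level})
    return data
```

The resulting records would then serve as supervised fine-tuning targets for the student model; the key design choice is that the teacher's supervision signal is conditioned on student feedback rather than fixed in advance.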
Keywords: large language model; chain of thought; knowledge distillation
JEL-codes: C
Date: 2025
Downloads: (external link)
https://www.mdpi.com/2227-7390/13/22/3646/pdf (application/pdf)
https://www.mdpi.com/2227-7390/13/22/3646/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:13:y:2025:i:22:p:3646-:d:1794274
Mathematics is currently edited by Ms. Emma He