Two-stage optimization based on heterogeneous branch fusion for knowledge distillation

Gang Li, Pengfei Lv, Yang Zhang, Chuanyun Xu, Zihan Ruan, Zheng Zhou, Xinyu Fan, Ru Wang and Pan He

PLOS ONE, 2025, vol. 20, issue 7, 1-22

Abstract: Knowledge distillation transfers knowledge from a teacher model to a student model, effectively improving the student's performance. However, relying solely on the teacher's fixed knowledge for guidance leaves no way to supplement or expand that knowledge, which limits the student model's generalization ability. This paper therefore proposes two-stage optimization based on heterogeneous branch fusion for knowledge distillation (THFKD), which supplies the student model with appropriate knowledge at each stage of training through a two-stage optimization strategy. Specifically, the pre-trained teacher offers stable, comprehensive static knowledge, preventing the student from deviating from the target early in training. Meanwhile, the student model acquires rich feature representations through heterogeneous branches and a progressive feature fusion module, generating dynamically updated collaborative learning objectives and thus effectively enhancing the diversity of the dynamic knowledge. Finally, in the first stage a ramp-up schedule gradually increases the loss weight over the stage, while in the second stage consistent loss weights are applied. The two-stage optimization strategy fully exploits the advantages of each type of knowledge, thereby improving the generalization ability of the student model. Although no tests of statistical significance were carried out, our experimental results on standard datasets (CIFAR-100, Tiny-ImageNet) and long-tail datasets (CIFAR100-LT) suggest that THFKD may slightly improve the student model's classification accuracy and generalization ability. For instance, with a ResNet110 teacher and a ResNet32 student on CIFAR-100, accuracy reaches 75.41%, a 1.52% improvement over the state of the art (SOTA).
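The two-stage loss weighting described in the abstract can be sketched as follows. This is a minimal, hypothetical Python illustration: the Gaussian ramp-up shape, the `ramp_up_epochs` cutoff, the `alpha` coefficient, and all function names are assumptions for exposition, not the paper's exact schedule.

```python
import math

def loss_weight(epoch, ramp_up_epochs=80, max_weight=1.0):
    """Weight for the dynamic (collaborative) knowledge loss.

    Stage 1 (epoch < ramp_up_epochs): the weight ramps up smoothly
    from near zero, so the stable teacher signal dominates early training.
    Stage 2 (epoch >= ramp_up_epochs): the weight is held constant.
    A Gaussian ramp-up is assumed here; the paper's exact curve may differ.
    """
    if epoch >= ramp_up_epochs:  # stage 2: consistent loss weight
        return max_weight
    t = epoch / ramp_up_epochs   # stage 1: smooth ramp-up in [0, 1)
    return max_weight * math.exp(-5.0 * (1.0 - t) ** 2)

def total_loss(ce, kd_static, kd_dynamic, epoch, alpha=1.0):
    """Combine the three scalar loss terms: cross-entropy on labels,
    static distillation from the frozen teacher, and the dynamically
    weighted collaborative objective from the fused branches."""
    return ce + alpha * kd_static + loss_weight(epoch) * kd_dynamic
```

Early on, `loss_weight` is close to zero, so the student is anchored by the teacher's static knowledge; as the fused branches mature, the dynamic objective's weight grows until it is held fixed in stage 2.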

Date: 2025

Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0326711 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 26711&type=printable (application/pdf)



Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0326711

DOI: 10.1371/journal.pone.0326711


More articles in PLOS ONE from Public Library of Science

 
Page updated 2025-07-26
Handle: RePEc:plo:pone00:0326711