Decoupled neural network training with re-computation and weight prediction
Jiawei Peng,
Yicheng Xu,
Zhiping Lin,
Zhenyu Weng,
Zishuo Yang and
Huiping Zhuang
PLOS ONE, 2023, vol. 18, issue 2, 1-23
Abstract:
To break the three lockings in the backpropagation (BP) process for neural network training, multiple decoupled learning methods have been investigated recently. These methods either lead to a significant drop in accuracy or suffer from a dramatic increase in memory usage. In this paper, a new form of decoupled learning, named decoupled neural network training scheme with re-computation and weight prediction (DTRP), is proposed. In DTRP, a re-computation scheme is adopted to solve the memory explosion problem, and a weight prediction scheme is proposed to deal with the weight delay caused by re-computation. Additionally, a batch compensation scheme is developed, allowing the proposed DTRP to run faster. Theoretical analysis shows that DTRP is guaranteed to converge to critical points under certain conditions. Experiments are conducted by training various convolutional neural networks on several classification datasets, showing results comparable to or better than those of the state-of-the-art methods and BP. These experiments also reveal that, with the proposed method, the memory explosion problem is effectively solved and a significant acceleration is achieved.
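The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the DTRP algorithm: the abstract gives neither the module partitioning nor the prediction formula, so everything below is an assumption made for illustration. The toy network is a chain of scalar layers `y = tanh(w * x)`, re-computation is shown as re-running the forward pass instead of storing layer inputs, and `predict_weight` is a hypothetical linear extrapolation of the kind used in some delayed-gradient methods. All function names are invented for this example.

```python
import math


def layer_forward(w, x):
    """One toy scalar layer: y = tanh(w * x)."""
    return math.tanh(w * x)


def layer_backward(w, x, grad_y):
    """Return (dL/dw, dL/dx) for y = tanh(w * x), given dL/dy."""
    y = math.tanh(w * x)
    local = (1.0 - y * y) * grad_y
    return local * x, local * w


def grads_with_stored_activations(weights, x, target):
    """Plain BP: every layer input is kept in memory for the backward pass."""
    inputs = []
    h = x
    for w in weights:
        inputs.append(h)
        h = layer_forward(w, h)
    grad = 2.0 * (h - target)  # derivative of the squared error (h - target)^2
    grads = [0.0] * len(weights)
    for i in reversed(range(len(weights))):
        grads[i], grad = layer_backward(weights[i], inputs[i], grad)
    return grads


def grads_with_recomputation(weights, x, target):
    """Keep only the network input; rebuild each layer input when it is needed."""
    h = x
    for w in weights:
        h = layer_forward(w, h)  # intermediate values are discarded
    grad = 2.0 * (h - target)
    grads = [0.0] * len(weights)
    for i in reversed(range(len(weights))):
        h_in = x
        for w in weights[:i]:  # re-run the forward pass up to layer i
            h_in = layer_forward(w, h_in)
        grads[i], grad = layer_backward(weights[i], h_in, grad)
    return grads


def predict_weight(w, velocity, lr, delay):
    """Hypothetical predictor: assume each of the next `delay` updates
    moves w by -lr * velocity, and extrapolate linearly."""
    return w - delay * lr * velocity


if __name__ == "__main__":
    weights = [0.5, -1.2, 0.8]
    print(grads_with_stored_activations(weights, 0.3, 1.0))
    print(grads_with_recomputation(weights, 0.3, 1.0))
    print(predict_weight(1.0, velocity=0.5, lr=0.1, delay=2))
```

With unchanged weights the two gradient routines agree, which is the point of the trade: memory is exchanged for extra forward computation. In a decoupled setting the weights may change between the original forward pass and the re-computation, which is the weight delay the abstract says the prediction scheme is meant to address.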
Date: 2023
Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0276427 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 76427&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0276427
DOI: 10.1371/journal.pone.0276427
More articles in PLOS ONE from Public Library of Science