Adaptive Hard Parameter Sharing Method Based on Multi-Task Deep Learning

Wang, Hongxia; Jin, Xiao; Du, Yukun; Zhang, Nan; Hao, Hongxia

Hongxia Wang, Xiao Jin, Yukun Du, Nan Zhang and Hongxia Hao
Additional contact information
Hongxia Wang: School of Statistics and Data Science, Nanjing Audit University, Nanjing 211815, China
Xiao Jin: School of Statistics and Data Science, Nanjing Audit University, Nanjing 211815, China
Yukun Du: School of Statistics and Data Science, Nanjing Audit University, Nanjing 211815, China
Nan Zhang: School of Statistics and Data Science, Nanjing Audit University, Nanjing 211815, China
Hongxia Hao: School of Statistics and Data Science, Nanjing Audit University, Nanjing 211815, China

Mathematics, 2023, vol. 11, issue 22, 1-18

Abstract: Multi-task learning (MTL) improves the performance on each task by exploiting related information shared between tasks. Most mainstream deep MTL models currently rely on hard parameter sharing, which reduces the risk of overfitting. However, negative knowledge transfer may occur and hinder the performance gains on individual tasks. In this paper, we propose an adaptive hard parameter sharing method for settings in which multiple tasks are trained jointly. The method dynamically updates the number of nodes in the network by setting a continuous gradient difference-based sign threshold and a warm-up training iteration threshold, based on the relationships between the parameters and the loss function. After each task has fully utilized the shared information, adaptive nodes are used to further optimize each task, reducing the impact of negative transfer. Simulation studies, instance analyses, and a theoretical proof indicate that the proposed method performs better than the competing method.
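
The abstract does not spell out the node-update rule, so the following is only a minimal sketch of one possible reading, not the authors' algorithm. It assumes that the "sign threshold" counts consecutive iterations in which two tasks' gradients on a shared parameter have opposite signs, and that no update may fire before the warm-up iterations have passed. The names `ConflictMonitor`, `first_trigger`, `warmup_steps` and `sign_threshold` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class ConflictMonitor:
    """Tracks how long two tasks' gradients on a shared parameter disagree in sign."""

    warmup_steps: int    # hypothetical: iterations of purely shared training
    sign_threshold: int  # hypothetical: consecutive conflicts needed to trigger
    step: int = 0
    streak: int = 0

    def update(self, grad_task_a: float, grad_task_b: float) -> bool:
        """Record one iteration; return True when adaptive nodes would be added."""
        self.step += 1
        # Opposite signs: the tasks pull the shared parameter in opposite directions.
        conflict = grad_task_a * grad_task_b < 0
        self.streak = self.streak + 1 if conflict else 0
        # Assumption: never trigger during warm-up, so that each task first
        # makes use of the shared information.
        return self.step > self.warmup_steps and self.streak >= self.sign_threshold


def first_trigger(grads_a, grads_b, warmup_steps, sign_threshold):
    """Return the 1-based iteration at which the rule first fires, or None."""
    monitor = ConflictMonitor(warmup_steps, sign_threshold)
    for grad_a, grad_b in zip(grads_a, grads_b):
        if monitor.update(grad_a, grad_b):
            return monitor.step
    return None
```

In a full model, the returned iteration would be the point at which task-specific (adaptive) nodes are attached to the shared layers; how many nodes to add and where to attach them is left open here because the abstract does not say.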

Keywords: multi-task learning; continuous gradient difference threshold; warm-up; training iteration threshold; information sharing; adaptive nodes
JEL-codes: C
Date: 2023

Downloads:
https://www.mdpi.com/2227-7390/11/22/4639/pdf (application/pdf)
https://www.mdpi.com/2227-7390/11/22/4639/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:11:y:2023:i:22:p:4639-:d:1279666

Mathematics is currently edited by Ms. Emma He
