Lottery Rank-Pruning Adaptation Parameter Efficient Fine-Tuning
Juhyeong Kim, Gyunyeop Kim and Sangwoo Kang
Additional contact information
Juhyeong Kim, Gyunyeop Kim and Sangwoo Kang: School of Computing, Gachon University, 1342, Seongnam-daero, Sujeong-gu, Seongnam-si 13120, Republic of Korea
Mathematics, 2024, vol. 12, issue 23, 1-15
Abstract:
Recent studies on parameter-efficient fine-tuning (PEFT) have introduced effective and efficient methods for fine-tuning large language models (LLMs) on downstream tasks using far fewer parameters than full fine-tuning requires. Low-rank adaptation (LoRA) reduces the trainable-parameter count to roughly 0.03% of that of full fine-tuning while maintaining satisfactory performance, training only two low-rank matrices per adapted weight. However, limitations remain due to the lack of task-specific parameter selection during training. To mitigate these issues, we propose the Lottery Rank-Pruning Adaptation (LoRPA) method, which applies the Lottery Ticket Hypothesis to prune less significant parameters based on their magnitudes after an initial training phase. LoRPA first trains with a relatively large rank and then prunes, so that subsequent training with fewer parameters achieves better performance. We conducted experiments comparing LoRPA with LoRA baselines, including a setting with a relatively large rank. Experimental results on the GLUE benchmark with RoBERTa show that LoRPA achieves comparable results at the base scale and outperforms LoRA with various rank sizes by 0.04% to 0.74% at the large scale across multiple tasks. Additionally, on generative summarization with BART-base on the CNN/DailyMail and XSum datasets, LoRPA outperforms LoRA at the standard rank size and other PEFT methods on most metrics. These results validate the efficacy of lottery pruning for LoRA on downstream natural-language understanding and generation tasks.
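The pruning step described in the abstract, training a LoRA pair at a large rank and then keeping only the most significant rank components by magnitude, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the scoring rule (the norm of each rank-1 component of the LoRA update) and the function name `prune_lora_ranks` are assumptions introduced here for exposition.

```python
import numpy as np

def prune_lora_ranks(A, B, keep):
    """Magnitude-based rank pruning for a LoRA pair (illustrative sketch).

    A: (r, d_in) down-projection; B: (d_out, r) up-projection.
    The LoRA update is B @ A = sum_i outer(B[:, i], A[i, :]), so each rank
    component i is scored by the magnitude of its rank-1 contribution and
    only the top-`keep` components are retained for further training.
    """
    # Frobenius norm of outer(B[:, i], A[i, :]) factorizes into a product of norms.
    scores = np.linalg.norm(B, axis=0) * np.linalg.norm(A, axis=1)
    # Indices of the `keep` largest-magnitude components, in original order.
    keep_idx = np.sort(np.argsort(scores)[::-1][:keep])
    return A[keep_idx, :], B[:, keep_idx]

# Toy example: start at rank 8, prune down to rank 4.
rng = np.random.default_rng(0)
A = rng.normal(size=(8, 16))
B = rng.normal(size=(32, 8))
A_pruned, B_pruned = prune_lora_ranks(A, B, keep=4)
print(A_pruned.shape, B_pruned.shape)  # (4, 16) (32, 4)
```

In an actual LoRPA-style pipeline, the pruned pair would then be fine-tuned further, now with half the adapter parameters of the initial phase.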
Keywords: low-rank adaptation; parameter-efficient fine-tuning; transfer learning; large language model; deep learning
JEL-codes: C
Date: 2024
Downloads:
https://www.mdpi.com/2227-7390/12/23/3744/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/23/3744/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:23:p:3744-:d:1531567
Mathematics is currently edited by Ms. Emma He