SusGen-GPT: A Data-Centric LLM for Financial NLP and Sustainability Report Generation
Qilong Wu,
Xiaoneng Xiang,
Hejia Huang,
Xuan Wang,
Yeo Wei Jie,
Ranjan Satapathy,
Ricardo Shirota Filho and
Bharadwaj Veeravalli
Papers from arXiv.org
Abstract:
The rapid growth of the financial sector and the rising focus on Environmental, Social, and Governance (ESG) considerations highlight the need for advanced NLP tools. However, open-source LLMs proficient in both finance and ESG domains remain scarce. To address this gap, we introduce SusGen-30K, a category-balanced dataset comprising seven financial NLP tasks and ESG report generation, and propose TCFD-Bench, a benchmark for evaluating sustainability report generation. Leveraging this dataset, we developed SusGen-GPT, a suite of models achieving state-of-the-art performance across six adapted and two off-the-shelf tasks, trailing GPT-4 by only 2% despite using 7-8B parameters compared to GPT-4's 1,700B. Based on this, we propose the SusGen system, integrated with Retrieval-Augmented Generation (RAG), to assist in sustainability report generation. This work demonstrates the efficiency of our approach, advancing research in finance and ESG.
Date: 2024-12
New Economics Papers: this item is included in nep-ain, nep-cmp and nep-env
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
http://arxiv.org/pdf/2412.10906 Latest version (application/pdf)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:arx:papers:2412.10906
Access Statistics for this paper
More papers in Papers from arXiv.org
Bibliographic data for series maintained by arXiv administrators ().