ClimRetrieve: A Benchmarking Dataset for Information Retrieval from Corporate Climate Disclosures
Tobias Schimanski,
Jingwei Ni,
Roberto Spacey,
Nicola Ranger and
Markus Leippold
Additional contact information
Tobias Schimanski: University of Zurich
Jingwei Ni: ETH Zurich
Roberto Spacey: University of Oxford
Nicola Ranger: Environmental Change Institute, University of Oxford
Markus Leippold: University of Zurich; Swiss Finance Institute
No 24-89, Swiss Finance Institute Research Paper Series from Swiss Finance Institute
Abstract:
To handle the vast amounts of qualitative data produced in corporate climate communication, stakeholders increasingly rely on Retrieval Augmented Generation (RAG) systems. However, a significant gap remains in evaluating domain-specific information retrieval, the basis for answer generation. To address this challenge, this work simulates the typical tasks of a sustainability analyst by examining 30 sustainability reports with 16 detailed climate-related questions. As a result, we obtain a dataset with over 8.5K unique question-source-answer pairs labeled by different levels of relevance. Furthermore, we develop a use case with the dataset to investigate the integration of expert knowledge into information retrieval with embeddings. Although we show that incorporating expert knowledge works, we also outline the critical limitations of embeddings in knowledge-intensive downstream domains like climate change communication.
Pages: 17
Date: 2024-07
New Economics Papers: this item is included in nep-ene and nep-env
Downloads: (external link)
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4866498 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:chf:rpseri:rp2489