EconPapers    

Benchmarking Machine Learning Models for ESG Prediction in South Korea Using News-Derived Time Series

Kim Yunwoo and Junhyuk Hwang

No v2738_v1, SocArXiv from Center for Open Science

Abstract: Existing ESG ratings suffer from disclosure delays, inconsistencies, and uneven coverage, particularly in non-English markets. This paper addresses these issues by establishing the first machine learning benchmark for ESG prediction in the Korean market using news-derived time-series features. A standardized dataset of 278 Korean firms was constructed, and monthly sentiment and ESG-relevance features were generated from news using Korean-specific language models. A mask-aware CNN explicitly handles missing data by distinguishing observed months from imputed ones. The model achieved a Mean Absolute Error (MAE) of 17.9, a Root Mean Squared Error (RMSE) of 22.0, an R² of 0.12, and a Spearman’s ρ of 0.38, demonstrating that temporal modeling and explicit handling of missing data are crucial for improving predictive accuracy.
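
The idea of distinguishing observed months from imputed ones can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `build_mask_aware_input`, the zero fill value, and the sample series are all hypothetical, chosen only to show how a binary mask channel could accompany an imputed monthly feature before it is passed to a convolutional model.

```python
from typing import List, Optional, Tuple


def build_mask_aware_input(
    series: List[Optional[float]], fill_value: float = 0.0
) -> Tuple[List[float], List[float]]:
    """Return (values, mask) for one monthly feature series.

    `None` marks a month with no news coverage. Missing months are
    imputed with `fill_value` (zero fill is an assumption made here),
    and the mask holds 1.0 for observed months and 0.0 for imputed ones.
    """
    values = [fill_value if v is None else float(v) for v in series]
    mask = [0.0 if v is None else 1.0 for v in series]
    return values, mask


# Hypothetical firm: six months of a sentiment feature, two months missing.
monthly_sentiment = [0.2, None, -0.1, None, 0.4, 0.0]
values, mask = build_mask_aware_input(monthly_sentiment)

print(values)  # [0.2, 0.0, -0.1, 0.0, 0.4, 0.0]
print(mask)    # [1.0, 0.0, 1.0, 0.0, 1.0, 1.0]
```

In this sketch the second and sixth months both carry the value 0.0, but only the mask reveals that the sixth was actually observed. Stacking `values` and `mask` as two input channels is one plausible way a model could be given that distinction.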

Date: 2025-09-12
New Economics Papers: this item is included in nep-cmp

Downloads: (external link)
https://osf.io/download/68c3c1a9e33eca3b0feff8de/

Persistent link: https://EconPapers.repec.org/RePEc:osf:socarx:v2738_v1

DOI: 10.31219/osf.io/v2738_v1

Page updated 2025-10-01
Handle: RePEc:osf:socarx:v2738_v1