Feature importance in linear models with ensemble machine learning: A study of the Fama and French five-factor model
Tae Yeon Kwon
Finance Research Letters, 2025, vol. 71, issue C
Abstract:
This study explores key considerations for interpreting feature influence and importance in Machine Learning (ML) for financial models that commonly assume linearity. Simulations demonstrate that ML techniques, including Random Forest, XGBoost, and CatBoost, may produce misleading feature importance ranks when the underlying model is linear. We empirically examine the Fama–French five-factor model using U.S. monthly data from July 1964 to June 2024. While the most important factors are consistently identified, the ranks of moderately important factors vary depending on the estimation method. These results highlight the need for a critical application of ML in financial modeling when the purpose is interpretability.
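For reference, the five-factor regression named in the abstract is usually written as below. This is a sketch in the standard form from Fama and French (2015), not notation taken from this record, so the symbols are assumptions.

```latex
% Fama-French five-factor time-series regression (standard form; symbols assumed)
R_{it} - R_{Ft} = a_i + b_i \,(R_{Mt} - R_{Ft}) + s_i \,\mathit{SMB}_t
                + h_i \,\mathit{HML}_t + r_i \,\mathit{RMW}_t
                + c_i \,\mathit{CMA}_t + e_{it}
```

Here $R_{it}$ is the return on asset or portfolio $i$ in month $t$, $R_{Ft}$ the risk-free rate, and $R_{Mt}$ the market return. SMB, HML, RMW and CMA are the size, value, profitability and investment factors. In a linear specification like this, each slope measures the marginal effect of its factor on the excess return.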
Keywords: Linear model; Feature importance; Machine learning; Ensemble
Date: 2025
Full text (ScienceDirect subscribers only): http://www.sciencedirect.com/science/article/pii/S1544612324014351
Persistent link: https://EconPapers.repec.org/RePEc:eee:finlet:v:71:y:2025:i:c:s1544612324014351
DOI: 10.1016/j.frl.2024.106406
Finance Research Letters is currently edited by R. Gençay
Publisher: Elsevier