Feature selection methods for Cox proportional hazards model. Comparative study for financial and medical survival data
Wojciech Skwirz
Additional contact information
Wojciech Skwirz: Warsaw School of Economics
Bank i Kredyt, 2025, vol. 56, issue 1, 113-138
Abstract:
This study compares Cox proportional hazards models built with various feature selection techniques on medical and financial datasets. Eight feature selection techniques (three variants of forward selection, two variants of selection based on principal component analysis, selection based on a random survival forest, best subset selection, and selection based on LASSO regularization) were tested on 22 multidimensional datasets (2 financial and 20 medical). The resulting Cox models were compared using the concordance index. The main hypothesis of the study, that LASSO regularization or selection based on a random survival forest (both of which generate good models for medical data) would yield similar performance on financial data, was disproved. The forward Schwarz and best subset selection methods gave the best results for financial data, while LASSO and random survival forest proved the most effective in the medical setting, for each model size considered.
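The concordance index used above to compare the models can be illustrated with a minimal sketch. This is not the author's code: the function name `concordance_index`, its arguments, and the toy data are hypothetical, and the sketch assumes Harrell's pairwise definition with tied event times simply skipped.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-style concordance index (illustrative sketch).

    times        observed follow-up times
    events       1 if the event was observed, 0 if censored
    risk_scores  model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair is comparable only if the shorter time ended in an event.
        if not events[i]:
            continue
        for j in range(n):
            # Simplification: pairs with tied times are skipped.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event had higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied risk counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration): four subjects, one censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
good_scores = [0.9, 0.7, 0.4, 0.1]   # risk falls as survival time grows
c_index = concordance_index(times, events, good_scores)
```

A value of 1.0 means every comparable pair is ordered correctly, 0.5 corresponds to random ordering, and values below 0.5 indicate a systematically reversed ranking.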
Keywords: survival analysis; feature selection; credit risk; financial data; medical data
JEL-codes: C24 C41 C52 C55 G21 I10
Date: 2025
Downloads: (external link)
https://bankikredyt.nbp.pl/content/2025/01/bik_01_2025_04.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:nbp:nbpbik:v:56:y:2025:i:1:p:113-138
More articles in Bank i Kredyt from Narodowy Bank Polski.
Bibliographic data for series maintained by Wojciech Burjanek.