Robust statistical boosting with quantile-based adaptive loss functions
Speller Jan (),
Staerk Christian () and
Mayr Andreas ()
Additional contact information
Speller Jan: Medical Faculty, Institute of Medical Biometrics, Informatics and Epidemiology (IMBIE), University of Bonn, Bonn, Germany
Staerk Christian: Medical Faculty, Institute of Medical Biometrics, Informatics and Epidemiology (IMBIE), University of Bonn, Bonn, Germany
Mayr Andreas: Medical Faculty, Institute of Medical Biometrics, Informatics and Epidemiology (IMBIE), University of Bonn, Bonn, Germany
The International Journal of Biostatistics, 2023, vol. 19, issue 1, 111-129
Abstract:
We combine robust loss functions with statistical boosting algorithms in an adaptive way to perform variable selection and predictive modelling for potentially high-dimensional biomedical data. To achieve robustness against outliers in the outcome variable (vertical outliers), we consider different composite robust loss functions together with base-learners for linear regression. For composite loss functions, such as the Huber loss and the Bisquare loss, a threshold parameter has to be specified that controls the robustness. In the context of boosting algorithms, we propose an approach that adapts the threshold parameter of composite robust losses in each iteration to the current sizes of residuals, based on a fixed quantile level. We compared the performance of our approach to classical M-regression, boosting with standard loss functions or the lasso regarding prediction accuracy and variable selection in different simulated settings: the adaptive Huber and Bisquare losses led to a better performance when the outcome contained outliers or was affected by specific types of corruption. For non-corrupted data, our approach yielded a similar performance to boosting with the efficient L 2 loss or the lasso. Also in the analysis of skewed KRT19 protein expression data based on gene expression measurements from human cancer cell lines (NCI-60 cell line panel), boosting with the new adaptive loss functions performed favourably compared to standard loss functions or competing robust approaches regarding prediction accuracy and resulted in very sparse models.
Keywords: Bisquare loss; gradient boosting; Huber loss; robust regression (search for similar items in EconPapers)
Date: 2023
References: Add references at CitEc
Citations:
Downloads: (external link)
https://doi.org/10.1515/ijb-2021-0127 (text/html)
For access to full text, subscription to the journal or payment for the individual article is required.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:bpj:ijbist:v:19:y:2023:i:1:p:111-129:n:12
Ordering information: This journal article can be ordered from
https://www.degruyter.com/journal/key/ijb/html
DOI: 10.1515/ijb-2021-0127
Access Statistics for this article
The International Journal of Biostatistics is currently edited by Antoine Chambaz, Alan E. Hubbard and Mark J. van der Laan
More articles in The International Journal of Biostatistics from De Gruyter
Bibliographic data for series maintained by Peter Golla ().