Prediction of tumour pathological subtype from genomic profile using sparse logistic regression with random effects
Özlem Kaymaz,
Khaled Alqahtani,
Henry M. Wood and
Arief Gusnanto
Journal of Applied Statistics, 2021, vol. 48, issue 4, 605-622
Abstract:
This study highlights the application of sparse logistic regression models to the prediction of tumour pathological subtypes from lung cancer patients' genomic information. We consider sparse logistic regression models to deal with the high dimensionality of, and the correlation between, genomic regions. In the hierarchical likelihood (HL) method, the random effects are assumed to follow a normal distribution whose variance is in turn assumed to follow a gamma distribution. This formulation includes the ridge and lasso penalties as special cases. We extend the HL penalty to include a ridge penalty (called ‘HLnet’), following the same principle as the elastic net penalty, which is built on the lasso penalty. The results indicate that the HL penalty produces sparser estimates than the lasso penalty with comparable prediction performance, while the HLnet and elastic net penalties have the best prediction performance in real data. We illustrate the methods in a lung cancer study.
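To make the penalties named in the abstract concrete, the following is a minimal sketch of logistic regression with an elastic net penalty, which reduces to the lasso penalty at `alpha=1` and to the ridge penalty at `alpha=0`. It is not the authors' implementation: the HL and HLnet penalties are not reproduced, because the abstract does not give their form. The function names, the proximal gradient solver, the parameter values, and the synthetic data standing in for genomic regions are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1, the shrinkage step that yields exact zeros."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def fit_elastic_net_logistic(X, y, lam=0.1, alpha=0.5, n_iter=500):
    """Hypothetical helper: proximal gradient descent for the mean logistic loss
    plus lam * (alpha * ||b||_1 + 0.5 * (1 - alpha) * ||b||_2^2).
    alpha=1 gives the lasso penalty, alpha=0 the ridge penalty."""
    n, p = X.shape
    beta = np.zeros(p)
    # Step size from a Lipschitz bound on the gradient of the smooth part.
    lipschitz = np.linalg.norm(X, 2) ** 2 / (4.0 * n) + lam * (1.0 - alpha)
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
        # Gradient of the logistic loss plus the ridge part of the penalty.
        grad = X.T @ (prob - y) / n + lam * (1.0 - alpha) * beta
        # The lasso part of the penalty is handled by soft-thresholding.
        beta = soft_threshold(beta - step * grad, step * lam * alpha)
    return beta


# Synthetic data (an assumption, not the lung cancer data from the study):
# 100 samples, 20 features, only the first 3 features carry signal.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
true_beta = np.zeros(20)
true_beta[:3] = [2.0, -2.0, 1.5]
y = (rng.random(100) < 1.0 / (1.0 + np.exp(-(X @ true_beta)))).astype(float)

beta_lasso = fit_elastic_net_logistic(X, y, lam=0.05, alpha=1.0)
beta_ridge = fit_elastic_net_logistic(X, y, lam=0.05, alpha=0.0)
beta_enet = fit_elastic_net_logistic(X, y, lam=0.05, alpha=0.5)

print("non-zero coefficients (lasso):", np.count_nonzero(beta_lasso))
print("non-zero coefficients (ridge):", np.count_nonzero(beta_ridge))
print("non-zero coefficients (elastic net):", np.count_nonzero(beta_enet))
```

Comparing the non-zero counts across the three fits shows the kind of sparsity comparison the abstract reports, here only for standard penalties on toy data.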
Date: 2021
Downloads: (external link)
http://hdl.handle.net/10.1080/02664763.2020.1738358 (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:taf:japsta:v:48:y:2021:i:4:p:605-622
Ordering information: This journal article can be ordered from
http://www.tandfonline.com/pricing/journal/CJAS20
DOI: 10.1080/02664763.2020.1738358
Journal of Applied Statistics is currently edited by Robert Aykroyd