High dimensional linear discriminant analysis: optimality, adaptive algorithm and missing data
T. Tony Cai and Linjun Zhang
Journal of the Royal Statistical Society Series B, 2019, vol. 81, issue 4, 675-705
Abstract:
The paper develops optimality theory for linear discriminant analysis in the high dimensional setting. A data‐driven and tuning‐free classification rule, which is based on an adaptive constrained ℓ1‐minimization approach, is proposed and analysed. Minimax lower bounds are obtained and this classification rule is shown to be simultaneously rate optimal over a collection of parameter spaces. In addition, we consider classification with incomplete data under the missingness completely at random model. An adaptive classifier with theoretical guarantees is introduced and the optimal rate of convergence for high dimensional linear discriminant analysis under the missingness completely at random model is established. The technical analysis for the case of missing data is much more challenging than that for complete data. We establish a large deviation result for the generalized sample covariance matrix, which serves as a key technical tool and can be of independent interest. An application to lung cancer and leukaemia studies is also discussed.
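For orientation only, here is a minimal sketch of the classical plug-in linear discriminant rule that the abstract takes as its starting point. It is not the adaptive constrained ℓ1‐minimization estimator proposed in the paper. The function names `fit_lda` and `predict_lda` and the simulated data are assumptions made for illustration, and the sketch assumes far more observations than dimensions so that the pooled covariance matrix is invertible.

```python
import numpy as np


def fit_lda(x1, x2):
    """Textbook plug-in LDA (illustrative, not the paper's estimator).

    Returns the midpoint of the two class means and the discriminant
    direction beta = S^{-1} (mu1 - mu2), with S the pooled covariance.
    """
    mu1, mu2 = x1.mean(axis=0), x2.mean(axis=0)
    n1, n2 = len(x1), len(x2)
    # Pooled sample covariance of the two classes.
    pooled = (
        (n1 - 1) * np.cov(x1, rowvar=False)
        + (n2 - 1) * np.cov(x2, rowvar=False)
    ) / (n1 + n2 - 2)
    # Solve S beta = mu1 - mu2 instead of forming the inverse explicitly.
    beta = np.linalg.solve(pooled, mu1 - mu2)
    return (mu1 + mu2) / 2, beta


def predict_lda(z, midpoint, beta):
    """Assign class 1 where (z - midpoint)^T beta >= 0, else class 2."""
    return np.where((z - midpoint) @ beta >= 0, 1, 2)


# Simulated low dimensional data (assumed for the example only).
rng = np.random.default_rng(0)
p = 5
x1 = rng.normal(loc=1.0, size=(200, p))
x2 = rng.normal(loc=-1.0, size=(200, p))

midpoint, beta = fit_lda(x1, x2)
pred1 = predict_lda(x1, midpoint, beta)
pred2 = predict_lda(x2, midpoint, beta)
accuracy = (np.sum(pred1 == 1) + np.sum(pred2 == 2)) / (len(x1) + len(x2))
```

When the dimension exceeds the sample size, the pooled covariance matrix in this sketch is singular and the linear solve fails, which is the high dimensional regime the abstract addresses.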
Date: 2019
Downloads: https://doi.org/10.1111/rssb.12326
Persistent link: https://EconPapers.repec.org/RePEc:bla:jorssb:v:81:y:2019:i:4:p:675-705
Journal of the Royal Statistical Society Series B is currently edited by P. Fryzlewicz and I. Van Keilegom