Interpolating discriminant functions in high-dimensional Gaussian latent mixtures
Xin Bing and Marten Wegkamp
Biometrika, 2024, vol. 111, issue 1, 291-308
Abstract:
This paper considers binary classification of high-dimensional features under a postulated model with a low-dimensional latent Gaussian mixture structure and nonvanishing noise. A generalized least-squares estimator is used to estimate the direction of the optimal separating hyperplane. The estimated hyperplane is shown to interpolate on the training data. While the direction vector can be consistently estimated, as could be expected from recent results in linear regression, a naive plug-in estimate fails to consistently estimate the intercept. A simple correction, which requires an independent hold-out sample, renders the procedure minimax optimal in many scenarios. The interpolation property of the latter procedure can be retained, but surprisingly depends on the way the labels are encoded.
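The interpolation claim in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' procedure: it assumes that, when the number of features exceeds the sample size, the least-squares direction is the minimum-norm solution computed with the Moore-Penrose pseudo-inverse. The sample sizes, class means, loading matrix `A`, and noise level are all hypothetical choices made for illustration, and the hold-out intercept correction is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 50, 200, 3  # hypothetical sizes: p > n, low latent dimension k

# Latent Gaussian mixture: Z | Y = y ~ N(mu_y, I_k); features X = Z A^T + W.
y = rng.integers(0, 2, size=n)              # labels encoded as {0, 1}
mu = np.stack([-np.ones(k), np.ones(k)])    # hypothetical class means
Z = mu[y] + rng.standard_normal((n, k))     # latent low-dimensional scores
A = rng.standard_normal((p, k))             # hypothetical loading matrix
X = Z @ A.T + rng.standard_normal((n, p))   # noise W does not vanish

# Minimum-norm least-squares direction (assumed form of the estimator).
theta = np.linalg.pinv(X) @ y

# With p > n and X of full row rank, the fitted values reproduce the labels.
fitted = X @ theta
max_residual = float(np.max(np.abs(fitted - y)))
interpolates = bool(np.allclose(fitted, y, atol=1e-6))
print(interpolates, max_residual)
```

Because the fitted values equal the encoded labels exactly, whether a thresholded classifier still separates the training points after an intercept is added depends on where that intercept falls relative to the label values, which is one way to see why the encoding can matter.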
Keywords: Benign overfitting; Discriminant analysis; Generalized least-squares estimate; High-dimensional classification; Minimax optimal rate of convergence; Overparameterization
Date: 2024
Downloads: http://hdl.handle.net/10.1093/biomet/asad037 (application/pdf)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:oup:biomet:v:111:y:2024:i:1:p:291-308.
Ordering information: This journal article can be ordered from
https://academic.oup.com/journals
Biometrika is currently edited by Paul Fearnhead
More articles in Biometrika from Biometrika Trust Oxford University Press, Great Clarendon Street, Oxford OX2 6DP, UK.
Bibliographic data for series maintained by Oxford University Press.