QDA classification of high-dimensional data with rare and weak signals
Hanning Chen,
Qiang Zhao and
Jingjing Wu ()
Additional contact information
Hanning Chen: University of Melbourne
Qiang Zhao: Shandong Normal University
Jingjing Wu: University of Calgary
Advances in Data Analysis and Classification, 2025, vol. 19, issue 1, No 3, 65 pages
Abstract:
Abstract This paper addresses the two-class classification problem for data with rare and weak signals, under the modern high-dimension setup $$p>>n$$ p > > n . Considering the two-component mixture of Gaussian features with different random mean vector of rare and weak signals but common covariance matrix (homoscedastic Gaussian), Fan (AS 41:2537-2571, 2013) investigated the optimality of linear discriminant analysis (LDA) and proposed an efficient variable selection and classification procedure. We extend their work by incorporating the more general scenario that the two components have different random covariance matrices with difference of rare and weak signals, in order to assess the effect of difference in covariance matrix on classification. Under this model, we investigated the behaviour of quadratic discriminant analysis (QDA) classifier. In theoretical aspect, we derived the successful and unsuccessful classification regions of QDA. For data of rare signals, variable selection will mostly improve the performance of statistical procedures. Thus in implementation aspect, we proposed a variable selection procedure for QDA based on the Higher Criticism Thresholding (HCT) that was proved efficient for LDA. In addition, we conducted extensive simulation studies to demonstrate the successful and unsuccessful classification regions of QDA and evaluate the effectiveness of the proposed HCT thresholded QDA.
Keywords: Quadratic discriminant analysis; Two-component mixture models; Rare and weak signals; Classification; Variable selection; Higher Criticism Thresholding; Primary 62H30; 62R07; Secondary 62F10; 62F12 (search for similar items in EconPapers)
Date: 2025
References: Add references at CitEc
Citations:
Downloads: (external link)
http://link.springer.com/10.1007/s11634-023-00576-0 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:spr:advdac:v:19:y:2025:i:1:d:10.1007_s11634-023-00576-0
Ordering information: This journal article can be ordered from
http://www.springer. ... ds/journal/11634/PS2
DOI: 10.1007/s11634-023-00576-0
Access Statistics for this article
Advances in Data Analysis and Classification is currently edited by H.-H. Bock, W. Gaul, A. Okada, M. Vichi and C. Weihs
More articles in Advances in Data Analysis and Classification from Springer, German Classification Society - Gesellschaft für Klassifikation (GfKl), Japanese Classification Society (JCS), Classification and Data Analysis Group of the Italian Statistical Society (CLADAG), International Federation of Classification Societies (IFCS)
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().