Proximal methods for sparse optimal scoring and discriminant analysis
Summer Atkins,
Gudmundur Einarsson,
Line Clemmensen and
Brendan Ames
Additional contact information
Summer Atkins: Louisiana State University
Gudmundur Einarsson: deCODE Genetics
Line Clemmensen: Technical University of Denmark
Brendan Ames: University of Alabama
Advances in Data Analysis and Classification, 2023, vol. 17, issue 4, No 7, pp. 983-1036
Abstract:
Linear discriminant analysis (LDA) is a classical method for dimensionality reduction, where discriminant vectors are sought to project data to a lower-dimensional space for optimal separability of classes. Several recent papers have outlined strategies, based on exploiting sparsity of the discriminant vectors, for performing LDA in the high-dimensional setting where the number of features exceeds the number of observations. However, many of these proposed methods lack scalable algorithms for solving the underlying optimization problems. We consider an optimization scheme for the sparse optimal scoring formulation of LDA based on block coordinate descent. Each iteration of this algorithm requires an update of a scoring vector, which admits an analytic formula, and an update of the corresponding discriminant vector, which requires solution of a convex subproblem; we propose several variants of this algorithm in which the proximal gradient method or the alternating direction method of multipliers is used to solve this subproblem. We show that the per-iteration cost of these methods scales linearly in the dimension of the data provided restricted regularization terms are employed, and cubically in the dimension of the data in the worst case. Furthermore, we establish that when this block coordinate descent framework generates convergent subsequences of iterates, these subsequences converge to stationary points of the sparse optimal scoring problem. We demonstrate the effectiveness of our new methods with empirical results for classification of Gaussian data and data sets drawn from benchmarking repositories, including time-series and multispectral X-ray data, and provide Matlab and R implementations of our optimization schemes.
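To make the alternating structure described in the abstract concrete, the following is a minimal Python sketch of the block coordinate descent idea for a single discriminant direction. It assumes the ridge penalty matrix Ω is the identity, omits the orthogonality constraints against previously computed score vectors, and uses a plain ISTA inner loop for the discriminant-vector subproblem; the function and parameter names are illustrative and do not correspond to the authors' Matlab or R implementations.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def sparse_optimal_scoring(X, Y, lam=0.1, gamma=1e-3, outer_iter=50, inner_iter=100):
    """Block coordinate descent sketch for one sparse discriminant direction.

    X : (n, p) centered data matrix.
    Y : (n, K) indicator matrix of class memberships.
    lam, gamma : l1 and ridge penalty weights (illustrative defaults).

    Alternates a closed-form update of the scoring vector theta with a
    proximal gradient (ISTA) solve of the l1-penalized beta subproblem.
    """
    n, p = X.shape
    D = (Y.T @ Y) / n                        # diagonal matrix of class proportions
    beta = np.full(p, 1.0 / np.sqrt(p))      # arbitrary nonzero initialization
    theta = np.ones(Y.shape[1])

    # Lipschitz constant of the smooth part's gradient (with Omega = I assumed):
    # grad = (2/n) X'(X beta - Y theta) + 2 gamma beta
    L = 2.0 * (np.linalg.norm(X, 2) ** 2 / n + gamma)

    for _ in range(outer_iter):
        # Scoring-vector update: closed form, theta proportional to
        # (Y'Y)^{-1} Y'X beta, normalized so that theta' D theta = 1.
        w = np.linalg.solve(Y.T @ Y, Y.T @ (X @ beta))
        theta = w / np.sqrt(w @ (D @ w))

        # Discriminant-vector update: ISTA iterations on
        # (1/n)||Y theta - X beta||^2 + gamma ||beta||^2 + lam ||beta||_1.
        y_t = Y @ theta
        for _ in range(inner_iter):
            grad = (2.0 / n) * (X.T @ (X @ beta - y_t)) + 2.0 * gamma * beta
            beta = soft_threshold(beta - grad / L, lam / L)

    return beta, theta
```

The variants studied in the paper would replace the inner loop above with an accelerated proximal gradient step or an ADMM splitting of the same convex subproblem; in either case the soft-thresholding step is what produces exact zeros in the discriminant vector.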
Keywords: Sparse discriminant analysis; Optimal scoring; Proximal gradient method; Alternating direction method of multipliers; 62H30; 62J12; 90C26
Date: 2023
Downloads (external link): http://link.springer.com/10.1007/s11634-022-00530-6 (abstract, text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:advdac:v:17:y:2023:i:4:d:10.1007_s11634-022-00530-6
Ordering information: This journal article can be ordered from
http://www.springer. ... ds/journal/11634/PS2
DOI: 10.1007/s11634-022-00530-6
Advances in Data Analysis and Classification is currently edited by H.-H. Bock, W. Gaul, A. Okada, M. Vichi and C. Weihs
More articles in Advances in Data Analysis and Classification from Springer, German Classification Society - Gesellschaft für Klassifikation (GfKl), Japanese Classification Society (JCS), Classification and Data Analysis Group of the Italian Statistical Society (CLADAG), International Federation of Classification Societies (IFCS)
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.