Wasserstein filter for variable screening in binary classification in the reproducing kernel Hilbert space

Sanghun Jeong, Choongrak Kim and Hojin Yang

Journal of Nonparametric Statistics, 2024, vol. 36, issue 3, 623-642

Abstract: This paper develops a marginal screening method for high-dimensional binary classification that uses the Wasserstein distance to capture distributional differences between the two classes. Many existing screening methods, such as the two-sample t-test and the Kolmogorov test, reduce the dimension of the predictors under parametric or nonparametric modeling assumptions. However, these model specifications and nonparametric approaches concern the probability measure induced by the predictor in a Euclidean space. Motivated by the success of many machine learning methods in finding nonlinear decision boundaries in a transformed space, the reproducing kernel Hilbert space (RKHS), we consider the Wasserstein filter's capacity to detect the distributional difference between two probability measures induced by a nonlinear function of the predictor in the RKHS. This allows us to flexibly filter out predictors that are non-informative for the binary classification while avoiding the modeling assumptions required in a Euclidean space. We prove that the Wasserstein filter satisfies the sure screening property under some mild conditions. We also demonstrate the advantages of the proposed approach by comparing its finite-sample performance with that of existing methods in simulation studies and in an application to lung cancer data.
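To make the idea of marginal screening by Wasserstein distance concrete, the following is a minimal sketch, not the authors' procedure. It scores each predictor by the one-dimensional Wasserstein-1 distance between its two class-conditional empirical distributions and keeps the highest-scoring predictors. It works on the standardized raw predictors and therefore omits the RKHS transformation that the paper relies on. The function names wasserstein_1d and wasserstein_screen, the standardization step, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def wasserstein_1d(u, v):
    """Wasserstein-1 distance between two 1-D empirical distributions.

    Computed as the integral of |F_u - F_v| over the pooled sample points.
    """
    u = np.sort(np.asarray(u, dtype=float))
    v = np.sort(np.asarray(v, dtype=float))
    grid = np.sort(np.concatenate([u, v]))
    widths = np.diff(grid)
    # Empirical CDFs evaluated at the left end of each interval of the grid.
    cdf_u = np.searchsorted(u, grid[:-1], side="right") / u.size
    cdf_v = np.searchsorted(v, grid[:-1], side="right") / v.size
    return float(np.sum(np.abs(cdf_u - cdf_v) * widths))


def wasserstein_screen(X, y, n_keep):
    """Rank predictors by class-conditional Wasserstein distance (illustrative).

    X: (n, p) array of predictors; y: length-n array of 0/1 labels.
    Returns the indices of the n_keep top-scoring predictors and all scores.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Standardize columns so scores are comparable across predictors
    # (an assumption of this sketch, since the distance is scale dependent).
    scale = X.std(axis=0)
    scale[scale == 0] = 1.0
    Z = (X - X.mean(axis=0)) / scale
    scores = np.array(
        [wasserstein_1d(Z[y == 0, j], Z[y == 1, j]) for j in range(Z.shape[1])]
    )
    order = np.argsort(scores)[::-1]  # largest distance first
    return order[:n_keep], scores


# Synthetic demonstration: predictor 0 differs in mean between the classes,
# predictor 1 differs in spread only, and the remaining predictors are noise.
rng = np.random.default_rng(0)
n, p = 400, 50
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, p))
X[y == 1, 0] += 2.0
X[y == 1, 1] *= 4.0

kept, scores = wasserstein_screen(X, y, n_keep=2)
print("kept predictors:", sorted(kept.tolist()))
```

Because the score compares whole distributions rather than means, the sketch can flag a predictor whose classes differ only in spread, a difference that a two-sample t-test is not designed to detect.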

Date: 2024

Full text (restricted to subscribers): http://hdl.handle.net/10.1080/10485252.2023.2235430

Persistent link: https://EconPapers.repec.org/RePEc:taf:gnstxx:v:36:y:2024:i:3:p:623-642

DOI: 10.1080/10485252.2023.2235430

Journal of Nonparametric Statistics is currently edited by Jun Shao

Journal of Nonparametric Statistics is published by Taylor & Francis Journals.

 