Conditional characteristic feature screening for massive imbalanced data

Ping Wang and Lu Lin
Additional contact information
Ping Wang: Shandong University
Lu Lin: Shandong University

Statistical Papers, 2023, vol. 64, issue 3, No 4, 807-834

Abstract: Using the conditional characteristic function as a screening index, a new model-free screening procedure is proposed for variable screening in large-scale, high-dimensional imbalanced data analysis. For a binary response, our results show that the screening index under full data is proportional to the screening index under case–control sampling, an important sampling property for imbalanced data. This implies that the screening method can be applied to imbalanced data. The most appealing feature of the screening index is that it can be expressed as a simple linear combination of two first-order moments, so it is computationally simple. In addition, we extend the method to multiple responses. The theoretical properties are established under regularity conditions. To compare the performance of our method with its competitors, extensive simulations are conducted, which show that the proposed procedure performs well in both linear and nonlinear models. Finally, a real data analysis further illustrates the effectiveness of the new method.

Keywords: Ultrahigh dimensionality; Massive imbalanced data; Model-free; Conditional characteristic screening; Case–control sampling
Date: 2023

Downloads: (external link)
http://link.springer.com/10.1007/s00362-022-01342-8 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:stpapr:v:64:y:2023:i:3:d:10.1007_s00362-022-01342-8

Ordering information: This journal article can be ordered from
http://www.springer. ... business/journal/362

DOI: 10.1007/s00362-022-01342-8

Statistical Papers is currently edited by C. Müller, W. Krämer and W.G. Müller

More articles in Statistical Papers from Springer
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-20
Handle: RePEc:spr:stpapr:v:64:y:2023:i:3:d:10.1007_s00362-022-01342-8