Feature screening for ultrahigh dimensional categorical data with covariates missing at random

Lyu Ni, Fang Fang and Jun Shao

Computational Statistics & Data Analysis, 2020, vol. 142, issue C

Abstract: Most existing feature screening methods assume that the data are fully observed. Developing screening methods for incomplete data is challenging because traditional missing data analysis techniques cannot be applied directly in the ultrahigh dimensional case. A two-step, model-free feature screening procedure is developed for ultrahigh dimensional categorical data in which some covariate values are missing at random. For each covariate with missing data, the first step screens out the variables in the unspecified propensity function. In the second step, screening statistics such as the adjusted Pearson Chi-Square statistic are calculated by leveraging the variables obtained in the first step and the special structure of categorical data. Sure screening properties are established for the proposed method, and its finite sample performance is investigated through simulation studies and a real data example.
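
As a rough illustration of the kind of statistic the abstract mentions, the sketch below ranks categorical covariates by the ordinary (unadjusted) Pearson Chi-Square statistic on fully observed data. It is not the authors' procedure: the two-step adjustment for covariates missing at random is not implemented, and the function names `pearson_chi2` and `screen`, the simulated data, and the cutoff `d = 5` are all assumptions made for this example.

```python
import numpy as np

def pearson_chi2(x, y):
    """Ordinary Pearson chi-square statistic for two categorical 1-D arrays."""
    # Map the observed levels to integer codes 0..K-1.
    x_codes = np.unique(x, return_inverse=True)[1].ravel()
    y_codes = np.unique(y, return_inverse=True)[1].ravel()
    # Build the contingency table of observed counts.
    table = np.zeros((x_codes.max() + 1, y_codes.max() + 1))
    np.add.at(table, (x_codes, y_codes), 1)
    # Expected counts under independence; every margin is positive
    # because only observed levels receive a code.
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()
    return float(((table - expected) ** 2 / expected).sum())

def screen(X, y, d):
    """Return the indices of the d covariates with the largest statistics."""
    stats = np.array([pearson_chi2(X[:, j], y) for j in range(X.shape[1])])
    selected = np.argsort(-stats)[:d]
    return selected, stats

# Simulated, fully observed data (hypothetical): only column 0 depends on y.
rng = np.random.default_rng(0)
n, p = 500, 200
y = rng.integers(0, 2, size=n)
X = rng.integers(0, 3, size=(n, p))
flip = rng.random(n) < 0.1
X[:, 0] = np.where(flip, 1 - y, y)  # column 0 agrees with y about 90% of the time

selected, stats = screen(X, y, d=5)
print("selected covariates:", selected)
```

Ranking by the raw statistic treats all covariates alike, which is reasonable here only because the simulated covariates have similar numbers of categories; covariates with differing numbers of levels, or with missing values, would need the kind of adjustment the paper describes.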

Keywords: Feature screening; Missing at random; Missing covariate; Pearson Chi-Square statistic; Sure screening property
Date: 2020

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0167947319301719
Full text for ScienceDirect subscribers only.

Persistent link: https://EconPapers.repec.org/RePEc:eee:csdana:v:142:y:2020:i:c:s0167947319301719

DOI: 10.1016/j.csda.2019.106824

Computational Statistics & Data Analysis is currently edited by S.P. Azen

More articles in Computational Statistics & Data Analysis from Elsevier
Bibliographic data for series maintained by Catherine Liu.

 
Handle: RePEc:eee:csdana:v:142:y:2020:i:c:s0167947319301719