Efficient and doubly-robust methods for variable selection and parameter estimation in longitudinal data analysis
Liya Fu (),
Zhuoran Yang (),
Fengjing Cai () and
You-Gan Wang ()
Additional contact information
Liya Fu: Xi’an Jiaotong University
Zhuoran Yang: Xi’an Jiaotong University
Fengjing Cai: Wenzhou University
You-Gan Wang: Queensland University of Technology
Computational Statistics, 2021, vol. 36, issue 2, No 1, 804 pages
Abstract:
Abstract New technologies have produced increasingly complex and massive datasets, such as next generation sequencing and microarray data in biology, dynamic treatment regimes in clinical trials and long-term wide-scale studies in the social sciences. Each study exhibits its unique data structure within individuals, clusters and possibly across time and space. In order to draw valid conclusion from such large dimensional data, we must account for intracluster correlations, varying cluster sizes, and outliers in response and/or covariate domains to achieve valid and efficient inferences. A weighted rank-based method is proposed for selecting variables and estimating parameters simultaneously. The main contribution of the proposed method is four fold: (1) variable selection using adaptive lasso is extended to robust rank regression so that protection against outliers in both response and predictor variables is obtained; (2) within-subject correlations are incorporated so that efficiency of parameter estimation is improved; (3) the computation is convenient via the existing function in statistical software R. (4) the proposed method is proved to have desirable asymptotic properties for fixed number of covariates (p). Simulation studies are carried out to evaluate the proposed method for a number of scenarios including the cases when p equals to the number of subjects. The simulation results indicate that the proposed method is efficient and robust. A hormone dataset is analyzed for illustration. By adding additional redundant variables as covariates, the penalty approach and weighting schemes are proven to be effective.
Keywords: Correlated data; Outliers; Rank-based method; Variable selection (search for similar items in EconPapers)
Date: 2021
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
http://link.springer.com/10.1007/s00180-020-01038-3 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:spr:compst:v:36:y:2021:i:2:d:10.1007_s00180-020-01038-3
Ordering information: This journal article can be ordered from
http://www.springer.com/statistics/journal/180/PS2
DOI: 10.1007/s00180-020-01038-3
Access Statistics for this article
Computational Statistics is currently edited by Wataru Sakamoto, Ricardo Cao and Jürgen Symanzik
More articles in Computational Statistics from Springer
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().