Concentration Based Inference in High Dimensional Generalized Regression Models (I: Statistical Guarantees)
Ying Zhu
MPRA Paper from University Library of Munich, Germany
Abstract:
We develop simple and non-asymptotically justified methods for hypothesis testing about the coefficients ($\theta^{*}\in\mathbb{R}^{p}$) in high dimensional generalized regression models where $p$ can exceed the sample size. Given a function $h:\,\mathbb{R}^{p}\mapsto\mathbb{R}^{m}$, we consider $H_{0}:\,h(\theta^{*})=\mathbf{0}_{m}$ against $H_{1}:\,h(\theta^{*})\neq\mathbf{0}_{m}$, where $m$ can be any integer in $\left[1,\,p\right]$ and $h$ can be nonlinear in $\theta^{*}$. Our test statistic is based on the sample ``quasi score'' vector evaluated at an estimate $\hat{\theta}_{\alpha}$ that satisfies $h(\hat{\theta}_{\alpha})=\mathbf{0}_{m}$, where $\alpha$ is the prespecified Type I error. We exploit the concentration phenomenon in Lipschitz functions: the key component reflecting the dimension complexity in our non-asymptotic thresholds uses a Monte-Carlo approximation to mimic the expectation around which the relevant Lipschitz function concentrates, and this approximation automatically captures the dependencies between the coordinates. We provide probabilistic guarantees in terms of the Type I and Type II errors for the quasi score test. Confidence regions are also constructed for the population quasi-score vector evaluated at $\theta^{*}$. The first set of our results is specific to the standard Gaussian linear regression models; the second set allows for reasonably flexible forms of non-Gaussian responses, heteroscedastic noise, and nonlinearity in the regression coefficients, while only requiring the correct specification of the conditional means $\mathbb{E}\left(Y_{i}|X_{i}\right)$. The novelty of our methods is that their validity does not rely on good behavior of $\left\Vert \hat{\theta}_{\alpha}-\theta^{*}\right\Vert _{2}$ (or even $n^{-1/2}\left\Vert X\left(\hat{\theta}_{\alpha}-\theta^{*}\right)\right\Vert _{2}$ in the linear regression case), either nonasymptotically or asymptotically.
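As a rough illustration of the thresholding idea in the abstract, the sketch below treats the simplest case only: a Gaussian linear model $Y=X\theta^{*}+\sigma W$ with known $\sigma$ and the simple null $\theta^{*}=\theta_{0}$ (that is, $h(\theta)=\theta-\theta_{0}$). Under this null the sample score $n^{-1}X^{\top}(Y-X\theta_{0})$ equals $n^{-1}\sigma X^{\top}W$, and its sup-norm is a Lipschitz function of the standard Gaussian vector $W$ with constant $L=\sigma\max_{j}\Vert X_{j}\Vert_{2}/n$, so Gaussian concentration suggests a threshold of the form (Monte Carlo mean) $+\,L\sqrt{2\log(1/\alpha)}$. The function names, the choice of the sup-norm, the known-$\sigma$ assumption, and the number of draws are all assumptions made for this illustration and are not taken from the paper; the sketch also ignores the error of the Monte Carlo average itself.

```python
import numpy as np


def quasi_score_threshold(X, sigma, alpha, n_draws=2000, seed=None):
    """Hypothetical helper: Monte Carlo mean of the score sup-norm plus a
    Gaussian concentration term (assumes known sigma, Gaussian noise)."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    # Each column of W is one simulated standard Gaussian noise vector.
    W = rng.standard_normal((n, n_draws))
    # Sup-norm of X^T w / n for every simulated draw.
    sup_norms = np.abs(X.T @ W).max(axis=0) / n
    mc_mean = sigma * sup_norms.mean()
    # Lipschitz constant of w -> sigma * ||X^T w||_inf / n in the l2 norm.
    lipschitz = sigma * np.linalg.norm(X, axis=0).max() / n
    return mc_mean + lipschitz * np.sqrt(2.0 * np.log(1.0 / alpha))


def quasi_score_test(X, y, theta0, sigma, alpha, n_draws=2000, seed=None):
    """Hypothetical helper: reject H0: theta* = theta0 when the sup-norm of
    the sample score at theta0 exceeds the threshold."""
    n = X.shape[0]
    stat = np.abs(X.T @ (y - X @ theta0)).max() / n
    thr = quasi_score_threshold(X, sigma, alpha, n_draws=n_draws, seed=seed)
    return stat, thr, bool(stat > thr)


# Small demonstration with p > n (synthetic data, fixed seeds).
rng = np.random.default_rng(0)
n, p, sigma, alpha = 50, 200, 1.0, 0.05
X = rng.standard_normal((n, p))
theta0 = np.zeros(p)

y_null = X @ theta0 + sigma * rng.standard_normal(n)   # H0 holds
theta_alt = theta0.copy()
theta_alt[0] = 5.0                                      # H0 violated
y_alt = X @ theta_alt + sigma * rng.standard_normal(n)

print(quasi_score_test(X, y_null, theta0, sigma, alpha, seed=1))
print(quasi_score_test(X, y_alt, theta0, sigma, alpha, seed=1))
```

Because the Monte Carlo draws reuse the observed design matrix $X$, correlations between its columns enter the simulated sup-norms directly, which is one way to read the abstract's remark that the approximation captures dependencies between coordinates.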
Keywords: Nonasymptotic inference; concentration inequalities; high dimensional inference; hypothesis testing; confidence sets
JEL-codes: C1 C12 C2 C21
Date: 2018-08-17
New Economics Papers: this item is included in nep-ecm and nep-ore
Downloads:
https://mpra.ub.uni-muenchen.de/88502/1/MPRA_paper_88502.pdf original version (application/pdf)
https://mpra.ub.uni-muenchen.de/89281/1/MPRA_paper_89281.pdf revised version (application/pdf)
https://mpra.ub.uni-muenchen.de/94645/1/MPRA_paper_94645.pdf revised version (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:pra:mprapa:88502
Series: MPRA Paper from University Library of Munich, Germany, Ludwigstraße 33, D-80539 Munich, Germany.
Bibliographic data for series maintained by Joachim Winter.