
Binary surrogates with stratified samples when weights are unknown

Yu-Min Huang (Tunghai University)

Computational Statistics, 2019, vol. 34, issue 2, No 12, 653-682

Abstract: In clinical practice, surrogate variables are commonly used as indirect measures when the primary outcome variable X, on which the assessment of disease status is based, is difficult or expensive to measure. In this article, we consider the problem of constructing an optimal binary surrogate Y to substitute for the feature variable X. To retain samples that have rare values of X, the paired sample (X, Y) is usually selected by stratified sampling, where the strata are constructed from disjoint intervals in the support of X. For such a sampling design, the stratum proportions are usually unknown, so proportional allocation is infeasible and the pairs (X, Y) cannot be regarded as an i.i.d. sample across strata. We estimate the unknown cutoff determining higher/lower levels of X that optimally match the variable Y and provide the true positive rates (TPR) adjusted for the disproportionate stratum weights. Our approach is to estimate the underlying distribution of X, then conduct an ad hoc estimation of the TPR and of the expected prediction errors under the zero-one loss function. We develop a parametric estimate of the distribution of X under an exponential family assumption and a weighted kernel density estimator for the case where the distribution of X is unspecified. We illustrate our methods in various simulation studies and on a real example where binary surrogates were evaluated for a medical device. The simulation results indicate that our approach performs well.
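
To make the idea of a stratum-weight-adjusted cutoff concrete, here is a minimal sketch in Python. It is not the article's estimator: the article treats the stratum weights as unknown and estimates the distribution of X, whereas this sketch assumes the stratum weights are simply supplied. The function names, the toy data, the rule "high X" = X > cutoff, and the reading of TPR as P(Y = 1 | X > cutoff) are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def observation_weights(strata: Sequence[int],
                        stratum_weights: Dict[int, float]) -> List[float]:
    """Spread each stratum's (assumed known) population share evenly
    over the units sampled from that stratum."""
    counts = Counter(strata)
    return [stratum_weights[s] / counts[s] for s in strata]


def error_and_tpr(x: Sequence[float], y: Sequence[int],
                  w: Sequence[float], cutoff: float) -> Tuple[float, float]:
    """Weighted zero-one prediction error and weighted TPR at one cutoff."""
    total = sum(w)
    # Zero-one loss: total weight of pairs where 1{x > cutoff} disagrees with y.
    err = sum(wi for xi, yi, wi in zip(x, y, w)
              if (xi > cutoff) != (yi == 1)) / total
    # Assumed TPR definition: P(Y = 1 | X > cutoff), using the same weights.
    high = sum(wi for xi, wi in zip(x, w) if xi > cutoff)
    hit = sum(wi for xi, yi, wi in zip(x, y, w) if xi > cutoff and yi == 1)
    tpr = hit / high if high > 0 else float("nan")
    return err, tpr


def best_cutoff(x: Sequence[float], y: Sequence[int],
                w: Sequence[float]) -> float:
    """Grid search over observed X values for the smallest weighted error."""
    return min(sorted(set(x)), key=lambda c: error_and_tpr(x, y, w, c)[0])


# Toy data (hypothetical): stratum 0 holds low X values, stratum 1 high ones.
x = [1.0, 2.0, 3.0, 4.0, 8.0, 9.0]
y = [0, 0, 0, 1, 0, 1]
strata = [0, 0, 0, 0, 1, 1]
w = observation_weights(strata, {0: 0.8, 1: 0.2})  # assumed stratum shares
cutoff = best_cutoff(x, y, w)
err, tpr = error_and_tpr(x, y, w, cutoff)
print(cutoff, round(err, 3), round(tpr, 3))
```

Because the rare high-X stratum is oversampled relative to its assumed share, its units are down-weighted, which changes both the error ranking of candidate cutoffs and the TPR compared with an unweighted calculation.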

Keywords: Surrogate variable; Biased sampling; Logistic model; Binary classification; Composite likelihood; Kernel density; Optimal cutoff values
Date: 2019
Downloads: (external link)
http://link.springer.com/10.1007/s00180-018-0838-3 Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:spr:compst:v:34:y:2019:i:2:d:10.1007_s00180-018-0838-3

Ordering information: This journal article can be ordered from
http://www.springer.com/statistics/journal/180/PS2

DOI: 10.1007/s00180-018-0838-3

Computational Statistics is currently edited by Wataru Sakamoto, Ricardo Cao and Jürgen Symanzik

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-03-20
Handle: RePEc:spr:compst:v:34:y:2019:i:2:d:10.1007_s00180-018-0838-3