Prediction of a Function of Misclassified Binary Data
Partha Lahiri and
Noriah M. Al-Kandari
Statistics in Transition new series, 2016, vol. 17, issue 3, 429-447
Abstract:
We consider the problem of predicting a function of misclassified binary variables. We observe that the naive predictor, which ignores the misclassification errors, is unbiased, even when the total misclassification error is high, as long as the probabilities of false positives and false negatives are identical. In all other cases, the bias of the naive predictor depends on the misclassification distribution, and its magnitude can be high in certain cases. We correct the bias of the naive predictor using a double sampling idea, in which both inaccurate and accurate measurements are taken on the binary variable for all units of a sample drawn from the original data using a probability sampling scheme. Using this additional information and design-based sample survey theory, we derive a bias-corrected predictor. We examine the cases where the new bias-corrected predictors can also improve over the naive predictor in terms of mean square error (MSE).
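The double sampling idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the target function is taken to be the population total of the true binary variable, the validation sample is a simple random sample without replacement, the correction is the standard survey-sampling difference estimator, and "probabilities of false positives and false negatives" are read as joint probabilities per unit. The function names are hypothetical.

```python
def naive_total(y_obs):
    """Naive predictor: sum the error-prone labels as if they were correct."""
    return sum(y_obs)


def bias_corrected_total(y_obs, y_true_sample, sample_idx):
    """Difference estimator (assumed here, not necessarily the paper's predictor).

    y_obs          -- error-prone 0/1 labels for all N units
    sample_idx     -- indices of the n units drawn by simple random sampling
    y_true_sample  -- accurate 0/1 labels for those units, in the same order
    """
    N, n = len(y_obs), len(sample_idx)
    # Observed classification errors on the validation sample.
    correction = sum(t - y_obs[i] for t, i in zip(y_true_sample, sample_idx))
    # Scale the sample errors up to the population by the design weight N/n.
    return naive_total(y_obs) + (N / n) * correction


def expected_naive_bias(p_false_pos, p_false_neg, N):
    """Expected bias of the naive total under the assumed reading:
    p_false_pos = P(observed=1, true=0), p_false_neg = P(observed=0, true=1)."""
    return N * (p_false_pos - p_false_neg)
```

Under these assumptions, `expected_naive_bias(0.2, 0.2, 1000)` is zero even though 40% of units are misclassified, which mirrors the abstract's observation; when the two probabilities differ, the validation sample supplies the correction term.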
Keywords: binary classification; double sampling; finite population sampling; misclassification; linkage error; sampling design
Date: 2016
Downloads:
http://index.stat.gov.pl/repec/files/csb/stintr/csb_stintr_v17_2016_i3_n5.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:csb:stintr:v:17:y:2016:i:3:p:429-447
Statistics in Transition new series is currently edited by Włodzimierz Okrasa
Statistics in Transition new series is published by Główny Urząd Statystyczny (Polska).
Bibliographic data for series maintained by Beata Witek.