# Prediction of a Function of Misclassified Binary Data

*Partha Lahiri* and
*Noriah M. Al-Kandari*

*Statistics in Transition new series*, 2016, vol. 17, issue 3, 429-447

**Abstract:**
We consider the problem of predicting a function of misclassified binary variables. We observe that the naive predictor, which ignores the misclassification errors, is unbiased even when the total misclassification error is high, as long as the probabilities of false positives and false negatives are identical. Outside this case, the bias of the naive predictor depends on the misclassification distribution, and its magnitude can be high in certain cases. We correct the bias of the naive predictor using a double sampling idea, in which both inaccurate and accurate measurements of the binary variable are taken on all units of a sample drawn from the original data using a probability sampling scheme. Using this additional information and design-based sample survey theory, we derive a bias-corrected predictor. We examine the cases in which the new bias-corrected predictors also improve on the naive predictor in terms of mean square error (MSE).
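
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the paper's predictor. It assumes the target function is the population proportion of ones, a simple error model with conditional rates `fp` and `fn`, and simple random sampling for the double sample. The function names and the difference-type correction are hypothetical choices made for illustration. Under this assumed model the naive bias is `(1 - p) * fp - p * fn`, which vanishes when the unconditional probabilities of a false positive and a false negative coincide. Whether that matches the paper's exact condition is an assumption.

```python
from typing import Sequence


def naive_bias(p: float, fp: float, fn: float) -> float:
    """Expected bias of the naive proportion under an assumed error model.

    p  : true proportion of ones (hypothetical input)
    fp : P(observed = 1 | true = 0), assumed conditional false-positive rate
    fn : P(observed = 0 | true = 1), assumed conditional false-negative rate
    """
    # E[observed proportion] = p * (1 - fn) + (1 - p) * fp,
    # so subtracting p leaves the bias below.
    return (1 - p) * fp - p * fn


def difference_corrected_mean(
    observed: Sequence[int],
    sample_idx: Sequence[int],
    accurate: Sequence[int],
) -> float:
    """Difference-type correction of the naive proportion (illustrative only).

    observed   : error-prone 0/1 values for every unit in the original data
    sample_idx : indices of the units re-measured accurately (assumed SRS)
    accurate   : accurate 0/1 values for those units, in the same order
    """
    if len(sample_idx) != len(accurate) or not sample_idx:
        raise ValueError("sample_idx and accurate must be non-empty and aligned")
    # Naive predictor: proportion of ones among the error-prone values.
    naive = sum(observed) / len(observed)
    # Estimated bias: mean of (observed - accurate) over the double sample.
    est_bias = sum(observed[i] - a for i, a in zip(sample_idx, accurate)) / len(sample_idx)
    return naive - est_bias
```

For example, `naive_bias(0.5, 0.3, 0.3)` is zero although 30% of the units are misclassified, while `naive_bias(0.2, 0.1, 0.1)` is positive because false positives outnumber false negatives when ones are rare.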

**Keywords:** binary classification; double sampling; finite population sampling; misclassification; linkage error; sampling design

**Date:** 2016

**Downloads:**

http://index.stat.gov.pl/repec/files/csb/stintr/csb_stintr_v17_2016_i3_n5.pdf (application/pdf)

**Persistent link:** https://EconPapers.repec.org/RePEc:csb:stintr:v:17:y:2016:i:3:p:429-447

Statistics in Transition new series is currently edited by *Włodzimierz Okrasa*

Series: Statistics in Transition new series, from Główny Urząd Statystyczny (Polska).

Bibliographic data for series maintained by Beata Witek.