Active learning with biased non-response to label requests

Thomas Robinson, Niek Tax, Richard Mudd and Ido Guy

LSE Research Online Documents on Economics from London School of Economics and Political Science, LSE Library

Abstract: Active learning can improve the efficiency of training prediction models by identifying the most informative new labels to acquire. However, non-response to label requests can impact active learning’s effectiveness in real-world contexts. We conceptualise this degradation by considering the type of non-response present in the data, demonstrating that biased non-response is particularly detrimental to model performance. We argue that biased non-response is likely in contexts where the labelling process, by nature, relies on user interactions. To mitigate the impact of biased non-response, we propose a cost-based correction to the sampling strategy–the Upper Confidence Bound of the Expected Utility (UCB-EU)–that can, plausibly, be applied to any active learning algorithm. Through experiments, we demonstrate that our method successfully reduces the harm from labelling non-response in many settings. However, we also characterise settings where the non-response bias in the annotations remains detrimental under UCB-EU for specific sampling methods and data generating processes. Finally, we evaluate our method on a real-world dataset from an e-commerce platform. We show that UCB-EU yields substantial performance improvements to conversion models that are trained on clicked impressions. Most generally, this research serves to both better conceptualise the interplay between types of non-response and model improvements via active learning, and to provide a practical, easy-to-implement correction that mitigates model degradation.

Keywords: active learning; non-response; missing data; e-commerce; CTR prediction
JEL-codes: L81
Pages: 24 pages
Date: 2024-07-01
New Economics Papers: this item is included in nep-ecm and nep-upt

Published in Data Mining and Knowledge Discovery, 38(4), 1 July 2024, pp. 2117-2140. ISSN: 1384-5810

Downloads: (external link)
http://eprints.lse.ac.uk/123029/ Open access version. (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:ehl:lserod:123029

More papers in LSE Research Online Documents on Economics from London School of Economics and Political Science, LSE Library, Portugal Street, London WC2A 2HD, U.K. Contact information at EDIRC.
Bibliographic data for series maintained by LSERO Manager (lseresearchonline@lse.ac.uk).

 
Page updated 2024-12-28
Handle: RePEc:ehl:lserod:123029