Multitask Learning and Bandits via Robust Statistics

Xu, Kan; Bastani, Hamsa

Kan Xu and Hamsa Bastani
Additional contact information
Kan Xu: W. P. Carey School of Business, Arizona State University, Tempe, Arizona 85287
Hamsa Bastani: Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania 19104

Management Science, 2025, vol. 71, issue 9, 7752-7773

Abstract: Decision makers often simultaneously face many related but heterogeneous learning problems. For instance, a large retailer may wish to learn product demand at different stores to solve pricing or inventory problems, making it desirable to learn jointly for stores serving similar customers; alternatively, a hospital network may wish to learn patient risk at different providers to allocate personalized interventions, making it desirable to learn jointly for hospitals serving similar patient populations. Motivated by real data sets, we study a natural setting where the unknown parameter in each learning instance can be decomposed into a shared global parameter plus a sparse instance-specific term. We propose a novel two-stage multitask learning estimator that exploits this structure in a sample-efficient way, using a unique combination of robust statistics (to learn across similar instances) and LASSO regression (to debias the results). Our estimator yields improved sample complexity bounds in the feature dimension d relative to commonly employed estimators; this improvement is exponential for “data-poor” instances, which benefit the most from multitask learning. We illustrate the utility of these results for online learning by embedding our multitask estimator within simultaneous contextual bandit algorithms. We specify a dynamic calibration of our estimator to appropriately balance the bias-variance trade-off over time, improving the resulting regret bounds in the context dimension d. Finally, we illustrate the value of our approach on synthetic and real data sets.
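
The two-stage idea in the abstract (a robust aggregate across instances, then a LASSO correction per instance) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' estimator: the abstract does not say which robust statistic is used, so a coordinate-wise median of per-instance least-squares fits stands in for it; the function names (`lasso_cd`, `two_stage_estimate`, `make_instances`), the penalty `lam`, and the synthetic data are all hypothetical.

```python
import random
import statistics

def soft_threshold(z, t):
    """Soft-thresholding operator used in LASSO coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent.
    With lam = 0 this reduces to ordinary least squares."""
    n, d = len(X), len(X[0])
    b = [0.0] * d
    resid = list(y)  # residual y - Xb; b starts at zero
    col_sq = [sum(X[i][k] ** 2 for i in range(n)) / n for k in range(d)]
    for _ in range(n_iter):
        for k in range(d):
            if col_sq[k] == 0.0:
                continue
            # Correlation of column k with the residual that excludes b[k].
            rho = sum(X[i][k] * resid[i] for i in range(n)) / n + col_sq[k] * b[k]
            new_bk = soft_threshold(rho, lam) / col_sq[k]
            step = new_bk - b[k]
            if step != 0.0:
                for i in range(n):
                    resid[i] -= X[i][k] * step
                b[k] = new_bk
    return b

def two_stage_estimate(datasets, lam):
    """datasets: list of (X, y) pairs, one per learning instance."""
    d = len(datasets[0][0][0])
    # Stage 1 (assumed robust statistic): coordinate-wise median of local OLS fits.
    local = [lasso_cd(X, y, lam=0.0) for X, y in datasets]
    shared = [statistics.median(b[k] for b in local) for k in range(d)]
    # Stage 2: LASSO on each instance's residuals recovers a sparse deviation.
    estimates = []
    for X, y in datasets:
        resid = [yi - sum(x * s for x, s in zip(row, shared)) for row, yi in zip(X, y)]
        delta = lasso_cd(X, resid, lam)
        estimates.append([s + dk for s, dk in zip(shared, delta)])
    return shared, estimates

def make_instances(seed=0, n_instances=7, n=60, noise=0.1):
    """Hypothetical synthetic data: shared parameter plus one shifted coordinate."""
    rng = random.Random(seed)
    shared = [1.0, -2.0, 0.5, 0.0, 3.0]
    d = len(shared)
    betas, datasets = [], []
    for j in range(n_instances):
        beta = list(shared)
        beta[j % d] += 2.0  # sparse instance-specific term
        X = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
        y = [sum(x * b for x, b in zip(row, beta)) + rng.gauss(0, noise) for row in X]
        betas.append(beta)
        datasets.append((X, y))
    return shared, betas, datasets

shared_true, betas_true, data = make_instances()
shared_hat, betas_hat = two_stage_estimate(data, lam=0.05)
print("shared estimate:", [round(v, 2) for v in shared_hat])
print("instance 0 estimate:", [round(v, 2) for v in betas_hat[0]])
```

In this toy setup each coordinate is shifted in at most two of the seven instances, so the median ignores those shifts when estimating the shared parameter, and the second-stage LASSO only needs to find the single shifted coordinate per instance.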

Keywords: multitask learning; transfer learning; robust statistics; LASSO; contextual bandits
Date: 2025

Downloads: http://dx.doi.org/10.1287/mnsc.2022.00490 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:inm:ormnsc:v:71:y:2025:i:9:p:7752-7773