How Big Should Your Data Really Be? Data-Driven Newsvendor: Learning One Sample at a Time

Besbes, Omar; Mouchtaki, Omar

Omar Besbes: Graduate School of Business, Columbia University, New York, New York 10027
Omar Mouchtaki: Graduate School of Business, Columbia University, New York, New York 10027

Management Science, 2023, vol. 69, issue 10, 5848-5865

Abstract: We study the classical newsvendor problem in which the decision maker must trade off underage and overage costs. In contrast to the typical setting, we assume that the decision maker does not know the underlying distribution driving uncertainty but has access only to historical data. In turn, the key questions are how to map existing data to a decision and what type of performance to expect as a function of the data size. We analyze the classical setting with access to past samples drawn from the distribution (e.g., past demand), focusing not only on asymptotic performance but also on what we call the transient regime of learning, that is, performance for arbitrary data sizes. We evaluate the performance of any algorithm through its worst-case relative expected regret, compared with an oracle with knowledge of the distribution. We provide the first finite sample exact analysis of the classical sample average approximation (SAA) algorithm for this class of problems across all data sizes. This allows one to uncover novel fundamental insights on the value of data: It reveals that tens of samples are sufficient to perform very efficiently but also that more data can lead to worse out-of-sample performance for SAA. We then focus on the general class of mappings from data to decisions without any restriction on the set of policies and derive an optimal algorithm (in the minimax sense) and characterize its associated performance. This leads to significant improvements for limited data sizes and allows one to quantify exactly the value of historical information.
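
For readers unfamiliar with SAA in the newsvendor setting, the following is a minimal sketch, not the authors' code. It assumes linear per-unit underage and overage costs, and all function and parameter names (`saa_order`, `underage`, `overage`, and so on) are illustrative rather than taken from the paper. Under these assumptions, SAA picks the order quantity that minimizes the average cost over the observed demand samples, which is an empirical quantile of the samples at the critical ratio underage / (underage + overage).

```python
import math


def newsvendor_cost(order, demand, underage, overage):
    """Cost of ordering `order` units when realized demand is `demand`.

    Assumes linear costs: `underage` per unit of unmet demand and
    `overage` per unit of leftover stock (an illustrative assumption).
    """
    return underage * max(demand - order, 0.0) + overage * max(order - demand, 0.0)


def empirical_cost(order, samples, underage, overage):
    """Average newsvendor cost of `order` over the observed demand samples."""
    return sum(newsvendor_cost(order, d, underage, overage) for d in samples) / len(samples)


def saa_order(samples, underage, overage):
    """Return an order quantity minimizing the empirical cost.

    The smallest minimizer is the k-th smallest sample, where
    k = ceil(q * n) and q = underage / (underage + overage).
    """
    if not samples:
        raise ValueError("at least one demand sample is required")
    critical_ratio = underage / (underage + overage)
    ordered = sorted(samples)
    # Clamp to 1 so that a degenerate ratio of 0 still yields a valid index.
    k = max(1, math.ceil(critical_ratio * len(ordered)))
    return ordered[k - 1]


if __name__ == "__main__":
    past_demand = [12, 7, 15, 9, 11]  # hypothetical historical demand
    order = saa_order(past_demand, underage=2.0, overage=1.0)
    print("SAA order quantity:", order)
    print("Empirical cost:", empirical_cost(order, past_demand, 2.0, 1.0))
```

The sketch only computes the SAA decision for a given sample; it does not reproduce the worst-case relative regret analysis or the minimax-optimal algorithm described in the abstract.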

Keywords: limited data; data-driven decisions; minimax regret; sample average approximation; empirical optimization; finite samples; distributionally robust optimization
Date: 2023

DOI link: http://dx.doi.org/10.1287/mnsc.2023.4725

Persistent link: https://EconPapers.repec.org/RePEc:inm:ormnsc:v:69:y:2023:i:10:p:5848-5865
