Learning and Optimization with Seasonal Patterns

Ningyuan Chen, Chun Wang and Longlin Wang
Additional contact information
Ningyuan Chen: Department of Management, University of Toronto Mississauga, Mississauga, Ontario L5L 1C6, Canada; and Rotman School of Management, University of Toronto, Canada, Toronto, Ontario M5S 3E6, Canada
Chun Wang: School of Economics and Management, Tsinghua University, Beijing 100190, China
Longlin Wang: School of Economics and Management, Tsinghua University, Beijing 100190, China; and Department of Statistics, Harvard University, Cambridge, Massachusetts 02138

Operations Research, 2025, vol. 73, issue 2, 894-909

Abstract: A standard assumption adopted in the multiarmed bandit (MAB) framework is that the mean rewards are constant over time. This assumption can be restrictive in the business world as decision makers often face an evolving environment in which the mean rewards are time-varying. In this paper, we consider a nonstationary MAB model with $K$ arms whose mean rewards vary over time in a periodic manner. The unknown periods can be different across arms and scale with the length of the horizon $T$ polynomially. We propose a two-stage policy that combines the Fourier analysis with a confidence bound–based learning procedure to learn the periods and minimize the regret. In stage one, the policy correctly estimates the periods of all arms with high probability. In stage two, the policy explores the periodic mean rewards of arms using the periods estimated in stage one and exploits the optimal arm in the long run. We show that our learning policy incurs a regret upper bound $\tilde{O}\big(\sqrt{T \sum_{k=1}^{K} T_k}\big)$, where $T_k$ is the period of arm $k$. Moreover, we establish a general lower bound $\Omega\big(\sqrt{T \max_k \{T_k\}}\big)$ for any policy. Therefore, our policy is near optimal up to a factor of $\sqrt{K}$.
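
Illustration (not from the paper): the abstract's stage one estimates each arm's period by Fourier analysis. The sketch below is one minimal way to do that for a single arm, assuming the arm is pulled at every round and its mean reward is a single sinusoid plus Gaussian noise. The function name `estimate_period`, the simulated reward model, and all numeric values are assumptions made for this example; the authors' actual estimator and sampling scheme may differ.

```python
import numpy as np


def estimate_period(rewards):
    """Estimate the dominant period of a reward sequence from its periodogram.

    Hypothetical helper for illustration; not the estimator from the paper.
    """
    x = np.asarray(rewards, dtype=float)
    n = x.size
    x = x - x.mean()                      # remove the zero-frequency component
    power = np.abs(np.fft.rfft(x)) ** 2   # periodogram at frequencies j / n
    j = int(np.argmax(power[1:])) + 1     # strongest nonzero frequency index
    return int(round(n / j))              # convert frequency j / n to a period


# Simulated arm (assumed model): sinusoidal mean reward with period 12.
rng = np.random.default_rng(0)
true_period = 12
n_rounds = 1200
t = np.arange(n_rounds)
mean_reward = 0.5 + 0.3 * np.sin(2 * np.pi * t / true_period)
samples = mean_reward + rng.normal(0.0, 0.2, size=n_rounds)

estimated_period = estimate_period(samples)
print(estimated_period)
```

With a single sinusoid the periodogram peak sits at the true frequency. For a general periodic mean reward the strongest peak may fall on a harmonic, so this simple rule could return a divisor of the true period; a fuller procedure would need to account for that.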

Keywords: Stochastic; Models; multiarmed bandit; nonstationary; periodicity; seasonality; spectral analysis
Date: 2025
Downloads: (external link)
http://dx.doi.org/10.1287/opre.2023.0017 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:inm:oropre:v:73:y:2025:i:2:p:894-909

Page updated 2025-04-05
Handle: RePEc:inm:oropre:v:73:y:2025:i:2:p:894-909