Temporal Fairness in Learning and Earning: Price Protection Guarantee and Phase Transitions
Qing Feng, Ruihao Zhu and Stefanus Jasin
Additional contact information
Qing Feng: School of Operations Research and Information Engineering, Cornell University, Ithaca, New York 14853
Ruihao Zhu: SC Johnson College of Business, Cornell University, Ithaca, New York 14853
Stefanus Jasin: Stephen M. Ross School of Business, University of Michigan, Ann Arbor, Michigan 48109
Operations Research, 2025, vol. 73, issue 2, 775-797
Abstract:
Motivated by the prevalence of price protection guarantees, which help promote temporal fairness in dynamic pricing, we study the impact of such a policy on the design of online learning algorithms for data-driven dynamic pricing with initially unknown customer demand. Under a price protection guarantee, a customer who purchased a product in the past can receive a refund from the seller during the so-called price protection period (typically defined as a certain time window after the purchase date) if the seller decides to lower the price. We consider a setting where a firm sells a product over a horizon of T time steps. For this setting, we characterize how the value of M, the length of the price protection period, affects the optimal regret of the learning process. We derive the optimal regret by first establishing a fundamental impossible regime with a novel refund-aware regret lower bound analysis. Then, we propose LEAP, a phased-exploration-type algorithm for Learning and EArning under Price Protection, that matches this lower bound up to logarithmic factors, or even doubly logarithmic factors when only two prices are available to the seller. Our results reveal surprising phase transitions of the optimal regret with respect to M. Specifically, when M is not too large, the optimal regret does not differ materially from that of the classic setting with no price protection guarantee. In addition, there is an upper limit on how much the optimal regret can deteriorate as M grows large. Finally, we conduct extensive numerical simulations with both synthetic and real-world data sets to show the benefit of LEAP over other heuristic methods for this problem. The numerical results suggest that, under certain realistic assumptions, it is indeed beneficial for the seller to set a longer price protection period.
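The refund mechanism described in the abstract can be illustrated with a minimal sketch. It is not the paper's model or the LEAP algorithm. It assumes, hypothetically, that a customer who buys at step s is refunded down to the lowest price posted in steps s through s+M, and that demand at each step is given rather than learned. The function name `revenue_with_price_protection` and its arguments are illustrative only.

```python
def revenue_with_price_protection(prices, sales, M):
    """Return (gross, refunds, net) revenue for a posted price path.

    prices[t]: price posted at time step t
    sales[t]:  units sold at time step t (taken as given here)
    M:         length of the price protection period in time steps

    Assumption (not from the source): a purchase at step s is refunded
    down to the lowest price posted during steps s, s+1, ..., s+M.
    """
    gross = sum(p * q for p, q in zip(prices, sales))
    refunds = 0.0
    for s, (p, q) in enumerate(zip(prices, sales)):
        # Slicing past the end of the list is safe in Python, so purchases
        # near the end of the horizon simply see a shorter window.
        lowest = min(prices[s : s + M + 1])
        refunds += q * (p - lowest)
    return gross, refunds, gross - refunds


if __name__ == "__main__":
    prices = [10, 8, 10, 6]
    sales = [3, 5, 2, 4]
    for M in (0, 1, 3):
        print(M, revenue_with_price_protection(prices, sales, M))
```

With M = 0 no refunds are paid and the setting reduces to ordinary dynamic pricing. As M grows, each price cut triggers refunds to more past customers, which is the cost that an exploring seller has to account for.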
Keywords: Market Analytics and Revenue Management; dynamic pricing; online learning; price protection; exploration-exploitation tradeoff
Date: 2025
Downloads: (external link)
http://dx.doi.org/10.1287/opre.2022.0629 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:inm:oropre:v:73:y:2025:i:2:p:775-797
More articles in Operations Research from INFORMS.
Bibliographic data for series maintained by Chris Asher.