Close the Gaps: A Learning-While-Doing Algorithm for Single-Product Revenue Management Problems

Wang, Zizhuo; Deng, Shiming; Ye, Yinyu

Zizhuo Wang, Shiming Deng and Yinyu Ye
Additional contact information
Zizhuo Wang: Department of Industrial and Systems Engineering, University of Minnesota, Minneapolis, Minnesota 55455
Shiming Deng: School of Management, Huazhong University of Science and Technology, Wuhan 430074, China
Yinyu Ye: Department of Management Science and Engineering, Stanford University, Stanford, California 94305

Operations Research, 2014, vol. 62, issue 2, 318-331

Abstract: We consider a retailer selling a single product with limited on-hand inventory over a finite selling season. Customer demand arrives according to a Poisson process, the rate of which is influenced by a single action taken by the retailer (such as price adjustment, sales commission, advertisement intensity, etc.). The relationship between the action and the demand rate is not known in advance. However, the retailer is able to learn the optimal action on the fly as she maximizes her total expected revenue based on the observed demand reactions. Using the pricing problem as an example, we propose a dynamic learning-while-doing algorithm that only involves function value estimation to achieve a near-optimal performance. Our algorithm employs a series of shrinking price intervals and iteratively tests prices within that interval using a set of carefully chosen parameters. We prove that the performance of our algorithm is among the best of all possible algorithms in terms of the asymptotic regret (the relative loss compared to the full information optimal solution). Our result closes the performance gaps between parametric and nonparametric learning and between the post-price mechanism and the customer-bidding mechanism. Important managerial insight from this research is that the values of information on both the parametric form of the demand function as well as each customer's exact reservation price are less important than prior literature suggests. Our results also suggest that firms would be better off to perform dynamic learning and action concurrently rather than sequentially.
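
The shrinking-interval idea described in the abstract can be illustrated with a toy simulation. The sketch below is not the algorithm from the paper: the function names (`simulate_sales`, `learn_price`), the grid size, the number of rounds, the halving rule, and the fraction of the season spent testing are all assumptions chosen for illustration, whereas the paper selects its parameters carefully to obtain its regret guarantee. The sketch also ignores the effect of the inventory limit on which price is best, and simply stops selling when stock runs out.

```python
import random


def simulate_sales(rate, duration, rng):
    """Count arrivals of a Poisson process with the given rate over `duration`."""
    if rate <= 0:
        return 0
    count = 0
    t = rng.expovariate(rate)  # time of the first arrival
    while t <= duration:
        count += 1
        t += rng.expovariate(rate)  # exponential gap to the next arrival
    return count


def learn_price(demand_rate, low, high, inventory, horizon,
                rounds=4, grid=5, explore_fraction=0.5, seed=0):
    """Toy learning-while-doing loop (illustrative parameters, not the paper's).

    Each round tests `grid` evenly spaced prices in [low, high], keeps the one
    with the highest observed revenue rate, and halves the interval around it.
    The remaining time is spent selling at the midpoint of the final interval.
    """
    rng = random.Random(seed)
    stock, revenue, time_left = inventory, 0.0, horizon
    per_test = horizon * explore_fraction / (rounds * grid)

    for _ in range(rounds):
        prices = [low + i * (high - low) / (grid - 1) for i in range(grid)]
        best_price, best_rate = prices[0], -1.0
        for p in prices:
            # Sales are capped by remaining stock, so estimates are censored
            # once inventory runs out (a limitation of this sketch).
            sold = min(simulate_sales(demand_rate(p), per_test, rng), stock)
            stock -= sold
            revenue += p * sold
            time_left -= per_test
            observed_rate = p * sold / per_test
            if observed_rate > best_rate:
                best_price, best_rate = p, observed_rate
        quarter = (high - low) / 4
        low, high = max(low, best_price - quarter), min(high, best_price + quarter)

    final_price = (low + high) / 2
    sold = min(simulate_sales(demand_rate(final_price), time_left, rng), stock)
    revenue += final_price * sold
    return final_price, (low, high), revenue


if __name__ == "__main__":
    # Hypothetical linear demand, unknown to the learner: rate = 10 - price.
    price, interval, revenue = learn_price(
        lambda p: max(0.0, 10.0 - p), low=1.0, high=9.0,
        inventory=10_000, horizon=1_000.0)
    print(price, interval, revenue)
```

Because the interval at least halves each round, four rounds narrow the initial range by a factor of 16 or more, and revenue is earned at every tested price along the way, which is the sense in which learning and selling happen concurrently rather than sequentially.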

Keywords: revenue management; pricing; nonparametric; learning; asymptotic optimality; dynamic decision making
Date: 2014
Citations in EconPapers: 50

Downloads: (external link)
http://dx.doi.org/10.1287/opre.2013.1245 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:inm:oropre:v:62:y:2014:i:2:p:318-331
