Robust Learning of Consumer Preferences

Feng, Yifan; Caldentey, René; Ryan, Christopher Thomas

Yifan Feng, René Caldentey and Christopher Thomas Ryan
Additional contact information
Yifan Feng: National University of Singapore Business School, Singapore 119245, Singapore
René Caldentey: Booth School of Business, University of Chicago, Chicago, Illinois 60637
Christopher Thomas Ryan: Sauder School of Business, University of British Columbia, Vancouver, British Columbia V6T 1Z2, Canada

Operations Research, 2022, vol. 70, issue 2, 918-962

Abstract: This paper studies a class of ranking and selection problems faced by a company that wants to identify the most preferred product out of a finite set of alternatives when consumer preferences are a priori unknown. The only information available is that consumer preferences satisfy two key properties: (i) they are consistent with some unknown true ranking of the alternatives, and (ii) they are strict, namely, no two products are equally preferred. To learn the unknown ranking, the company is able to sample consumer preferences by sequentially showing different subsets of products to different consumers and asking them to report their top preference within the displayed set. The objective of the company is to design a display policy that minimizes the expected number of samples needed to identify the top-ranked product with high probability. We prove an instance-specific lower bound on the sample complexity of any policy that identifies the top-ranked product within a given (probabilistic) confidence. We also propose a robust formulation of the company’s problem and derive a sampling policy (myopic tracking policy), which is both worst-case asymptotically optimal and intuitive to implement. Roughly speaking, the myopic tracking policy randomly alternates between two extreme types of displaying strategies: (i) full display, which shows a consumer the entire menu so as to learn something about every product, and (ii) pair display, which shows a consumer only two products so as to maximize the informativeness of the choice made by the consumer. To assess the performance of our proposed myopic tracking policy, we conduct a comprehensive set of computational studies and compare it to alternative methods in the literature.
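
Illustrative sketch (not from the paper): the sampling setup described in the abstract can be simulated in a few lines of Python. Everything here is an assumption made for illustration: the noise model in `sample_choice`, the fixed mixing probability `p_full`, the rule of pairing the two current leaders, and the fixed sample budget. It is not the authors' myopic tracking policy, which the abstract does not specify in detail; it only shows what randomly alternating between full display and pair display might look like.

```python
import random


def sample_choice(display, ranking, noise, rng):
    """Return the product one simulated consumer picks from `display`.

    `ranking` lists products from most to least preferred. With probability
    1 - noise the consumer reports the best displayed product; otherwise the
    pick is uniform over the display (an assumed noise model).
    """
    if rng.random() < noise:
        return rng.choice(list(display))
    return min(display, key=ranking.index)


def alternate_displays(ranking, n_samples, noise=0.2, p_full=0.5, seed=0):
    """Guess the top-ranked product from `n_samples` simulated consumers.

    Hypothetical heuristic: each round shows either the full menu or the two
    products with the most reported wins so far, chosen at random.
    """
    rng = random.Random(seed)
    products = sorted(ranking)  # the policy sees labels, not the true order
    wins = {p: 0 for p in products}
    for _ in range(n_samples):
        if rng.random() < p_full:
            display = products  # full display: the entire menu
        else:
            # pair display: the two current leaders by win count
            display = sorted(products, key=lambda p: wins[p], reverse=True)[:2]
        wins[sample_choice(display, ranking, noise, rng)] += 1
    return max(products, key=lambda p: wins[p])


if __name__ == "__main__":
    true_ranking = ["c", "a", "d", "b"]  # hypothetical ground truth
    print(alternate_displays(true_ranking, n_samples=500))
```

A fixed budget is used here only to keep the sketch short; the paper's objective is instead to minimize the expected number of samples needed to reach a given confidence.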

Keywords: Revenue Management and Market Analytics; sequential learning; maximum selection; best arm identification; dynamic assortments; preference learning
Date: 2022

Downloads: (external link)
http://dx.doi.org/10.1287/opre.2021.2157 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:inm:oropre:v:70:y:2022:i:2:p:918-962
