Customer Acquisition via Display Advertising Using Multi-Armed Bandit Experiments

Eric M. Schwartz, Eric T. Bradlow and Peter S. Fader
Additional contact information
Eric M. Schwartz: Stephen M. Ross School of Business, University of Michigan, Ann Arbor, Michigan 48109
Eric T. Bradlow: The Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania 19104
Peter S. Fader: The Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania 19104

Marketing Science, 2017, vol. 36, issue 4, 500-522

Abstract: Firms using online advertising regularly run experiments with multiple versions of their ads since they are uncertain about which ones are most effective. During a campaign, firms try to adapt to intermediate results of their tests, optimizing what they earn while learning about their ads. Yet how should they decide what percentage of impressions to allocate to each ad? This paper answers that question, resolving the well-known “learn-and-earn” trade-off using multi-armed bandit (MAB) methods. The online advertiser’s MAB problem, however, contains particular challenges, such as a hierarchical structure (ads within a website), attributes of actions (creative elements of an ad), and batched decisions (millions of impressions at a time), that are not fully accommodated by existing MAB methods. Our approach captures how the impact of observable ad attributes on ad effectiveness differs by website in unobserved ways, and our policy generates allocations of impressions that can be used in practice. We implemented this policy in a live field experiment delivering over 750 million ad impressions in an online display campaign with a large retail bank. Over the course of two months, our policy achieved an 8% improvement in the customer acquisition rate, relative to a control policy, without any additional costs to the bank. Beyond the actual experiment, we performed counterfactual simulations to evaluate a range of alternative model specifications and allocation rules in MAB policies. Finally, we show that customer acquisition would decrease by about 10% if the firm were to optimize click-through rates instead of conversion directly, a finding that has implications for understanding the marketing funnel.
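The allocation question raised in the abstract (what percentage of impressions each ad should receive in the next batch) can be illustrated with a minimal sketch. This is not the authors' policy: it assumes an independent Beta-Bernoulli model per ad and ignores the hierarchical website structure and ad attributes that the paper models. The function name `allocate_batch`, the uniform Beta(1, 1) prior, and the counts below are all hypothetical, chosen only for illustration.

```python
import random


def allocate_batch(successes, failures, n_draws=2000, seed=0):
    """Return the share of the next batch of impressions for each ad.

    Assumption: each ad's conversion rate has an independent
    Beta(1 + successes, 1 + failures) posterior (no hierarchy, no attributes).
    An ad's share is the Monte Carlo estimate of the posterior probability
    that it has the highest conversion rate among all ads.
    """
    rng = random.Random(seed)
    wins = [0] * len(successes)
    for _ in range(n_draws):
        # One joint posterior draw: a sampled conversion rate per ad.
        draws = [rng.betavariate(1 + s, 1 + f)
                 for s, f in zip(successes, failures)]
        # Credit the ad that looks best under this draw.
        wins[draws.index(max(draws))] += 1
    return [w / n_draws for w in wins]


# Hypothetical counts for three ads after one batch of impressions.
conversions = [12, 30, 18]
non_conversions = [9988, 9970, 9982]
shares = allocate_batch(conversions, non_conversions)
print(shares)
```

Allocating in proportion to the probability of being best keeps some impressions on ads that are still plausibly optimal (learning) while sending most to the current leader (earning). Because the shares are computed once per batch, the rule fits settings where millions of impressions are committed at a time.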

Keywords: multi-armed bandit; online advertising; field experiments; A/B testing; adaptive experiments; sequential decision making; explore-exploit; earning-and-learning; reinforcement learning; hierarchical models; machine learning
Date: 2017
Citations: 48 (in EconPapers)

Downloads: (external link)
https://doi.org/10.1287/mksc.2016.1023 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:inm:ormksc:v:36:y:2017:i:4:p:500-522

Bibliographic data for series maintained by Chris Asher.

 
Page updated 2025-03-19
Handle: RePEc:inm:ormksc:v:36:y:2017:i:4:p:500-522