What Did I Forget? Basket Analysis for Large Assortments Using Transformers
Luuk van Maasakkers, Bas Donkers and Dennis Fok
Additional contact information
Luuk van Maasakkers: Erasmus University Rotterdam
Bas Donkers: Erasmus University Rotterdam
Dennis Fok: Erasmus University Rotterdam
No 25-071/XII, Tinbergen Institute Discussion Papers from Tinbergen Institute
Abstract:
We propose a new method for learning product complementarity patterns in shopping baskets, inspired by Google's Bidirectional Encoder Representations from Transformers (BERT) for natural language processing. We reformulate BERT's masked learning task in a marketing context and learn to accurately identify missing products from a real-life grocery shopping basket based on the other products purchased in that same basket. The resulting model, which we call BaskERT, can be used by retailers for personalized product recommendations and for analyzing product complementarity patterns across the assortment. BaskERT outperforms several state-of-the-art benchmarks in a basket completion task. The procedure used to sample the missing product during training affects the variety of recommendations returned by the model. This enables marketers to steer their recommendations away from the most popular products. The model is easily scalable to large assortments. As our model only requires basket data from the current shopping trip, it is applicable in many situations, including those where customer information and purchase history data are not available, for example because of privacy regulations.
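The masked learning task described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the sampling procedures, so the `[MASK]` token, the helper names `product_counts` and `mask_one_product`, and the inverse-popularity weighting are all assumptions made for illustration. The sketch hides one product per basket, either uniformly at random or with a bias toward rarer products, and returns the hidden product as the prediction target.

```python
import random
from collections import Counter

MASK = "[MASK]"  # assumed placeholder token, following BERT's convention


def product_counts(baskets):
    """Count the number of baskets in which each product appears."""
    counts = Counter()
    for basket in baskets:
        counts.update(set(basket))
    return counts


def mask_one_product(basket, counts=None, rng=random):
    """Hide one product and return (masked_basket, target).

    counts=None: the hidden product is drawn uniformly from the basket.
    counts given: draw weights are 1/count, an assumed scheme under which
    rarer products are hidden, and so used as targets, more often.
    """
    if len(basket) < 2:
        raise ValueError("need at least two products to keep some context")
    if counts is None:
        weights = [1.0] * len(basket)
    else:
        # max(..., 1) guards against products missing from the counts
        weights = [1.0 / max(counts[p], 1) for p in basket]
    idx = rng.choices(range(len(basket)), weights=weights, k=1)[0]
    masked = list(basket)
    target = masked[idx]
    masked[idx] = MASK
    return masked, target


# Toy data, invented for this example
baskets = [["milk", "bread", "butter"], ["milk", "pasta", "pesto"], ["milk", "bread"]]
counts = product_counts(baskets)
masked, target = mask_one_product(baskets[0], counts, random.Random(0))
```

In this toy data "milk" appears in every basket, so under the inverse-popularity weights it is hidden less often than "butter". That is one plausible way a sampling choice could shift a trained model's recommendations away from the most popular products.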
Keywords: product basket prediction; machine learning; transformers; product embedding
JEL-codes: M31
Date: 2025-12-11
Downloads:
https://papers.tinbergen.nl/25071.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:tin:wpaper:20250071
Bibliographic data for series maintained by Tinbergen Office, +31 (0)10-4088900.