Modeling household online shopping demand in the U.S.: a machine learning approach and comparative investigation between 2009 and 2017
Limon Barua, Bo Zou, Yan Zhou and Yulin Liu
Additional contact information
Limon Barua: University of Illinois Chicago
Bo Zou: University of Illinois Chicago
Yan Zhou: Argonne National Laboratory
Yulin Liu: University of California
Transportation, 2023, vol. 50, issue 2, No 4, 437-476
Abstract:
Despite the rapid growth of online shopping and of research interest in the relationship between online and in-store shopping, national-level modeling of online shopping demand with a prediction focus remains limited in the literature. This paper leverages two recent releases of the U.S. National Household Travel Survey (NHTS), for 2009 and 2017, to develop machine learning (ML) models, specifically gradient boosting machines (GBM), for predicting household-level online shopping purchases. The NHTS data allow the investigation to be conducted not only nationwide but also at the household level, which is more appropriate than the individual level given the connected consumption and shopping needs of household members. We follow a systematic procedure for model development, including the use of the Recursive Feature Elimination algorithm to select input variables (features), in order to reduce the risk of model overfitting and increase model explainability. Among several ML models, GBM is found to yield the best prediction accuracy. Extensive post-modeling investigation is conducted in a comparative manner between 2009 and 2017, including quantifying the importance of each input variable in predicting online shopping demand and characterizing value-dependent relationships between demand and the input variables. In doing so, two recent advances in machine learning, namely Shapley value-based feature importance and Accumulated Local Effects plots, are adopted to overcome inherent drawbacks of the techniques popular in current ML modeling. The modeling and investigation, performed at the national level, yield a number of findings. The models developed and insights gained can be used for online shopping-related freight demand generation and may also be considered for evaluating the potential impact of relevant policies on online shopping demand.
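To make the idea of Shapley value-based feature importance concrete, the following is a minimal sketch that computes exact Shapley values for a single prediction by enumerating all feature coalitions. It is not the authors' implementation: the function names, the toy model, the three feature names, and the choice to represent an "absent" feature by a fixed baseline value are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, x, baseline):
    """Exact Shapley values for one prediction.

    Assumption: a feature outside the coalition is set to its baseline
    value. This is a simplification chosen for the sketch.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without_i = [x[j] if j in coalition else baseline[j]
                             for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition.
                phi[i] += weight * (value_fn(with_i) - value_fn(without_i))
    return phi


def toy_model(features):
    """Hypothetical demand model with one interaction term."""
    income, household_size, internet_use = features
    return 2.0 * income + 1.0 * household_size + 0.5 * income * internet_use


x = [3.0, 4.0, 2.0]         # hypothetical household to explain
baseline = [1.0, 2.0, 0.0]  # hypothetical reference household
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The per-feature contributions sum to the difference between the prediction for `x` and the prediction for `baseline`, which is the efficiency property that makes Shapley values attractive for attributing a prediction to its inputs. Exact enumeration grows exponentially with the number of features, so practical tools rely on approximations.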
Keywords: Online shopping demand; Gradient boosting machine; Prediction; National Household Travel Survey; Shapley value-based feature importance; Accumulated local effects
Date: 2023
Downloads: (external link)
http://link.springer.com/10.1007/s11116-021-10250-z Abstract (text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:kap:transp:v:50:y:2023:i:2:d:10.1007_s11116-021-10250-z
DOI: 10.1007/s11116-021-10250-z
Transportation is currently edited by Kay W. Axhausen
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.