Real-time user clickstream behavior analysis based on apache storm streaming
Gautam Pal (),
Katie Atkinson and
Gangmin Li
Additional contact information
Gautam Pal: The University of Liverpool
Katie Atkinson: The University of Liverpool
Gangmin Li: University of Bedfordshire
Electronic Commerce Research, 2023, vol. 23, issue 3, No 20, 1829-1859
Abstract:
Abstract This paper presents an approach to analyzing consumers’ e-commerce site usage and browsing motifs through pattern mining and surfing behavior. User-generated clickstream is first stored in a client site browser. We build an ingestion pipeline to capture the high-velocity data stream from a client-side browser through Apache Storm, Kafka, and Cassandra. Given the consumer’s usage pattern, we uncover the user’s browsing intent through n-grams and Collocation methods. An innovative clustering technique is constructed through the Expectation-Maximization algorithm with Gaussian Mixture Model. We discuss a framework for predicting a user’s clicks based on the past click sequences through higher order Markov Chains. We developed our model on top of a big data Lambda Architecture which combines high throughput Hadoop batch setup with low latency real-time framework over a large distributed cluster. Based on this approach, we developed an experimental setup for an optimized Storm topology and enhanced Cassandra database latency to achieve real-time responses. The theoretical claims are corroborated with several evaluations in Microsoft Azure HDInsight Apache Storm deployment and in the Datastax distribution of Cassandra. The paper demonstrates that the proposed techniques help user experience optimization, building recently viewed products list, market-driven analyses, and allocation of website resources.
Keywords: Clickstream analytics; Real-time big data analytics; Real-time data ingestion; Apache storm; Cassandra; Datastax (search for similar items in EconPapers)
Date: 2023
References: View references in EconPapers View complete reference list from CitEc
Citations:
Downloads: (external link)
http://link.springer.com/10.1007/s10660-021-09518-4 Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:spr:elcore:v:23:y:2023:i:3:d:10.1007_s10660-021-09518-4
Ordering information: This journal article can be ordered from
http://www.springer.com/journal/10660
DOI: 10.1007/s10660-021-09518-4
Access Statistics for this article
Electronic Commerce Research is currently edited by James Westland
More articles in Electronic Commerce Research from Springer
Bibliographic data for series maintained by Sonal Shukla () and Springer Nature Abstracting and Indexing ().