Topic classification of Vietnamese product reviews in e-commerce using PhoBERT

Tuan Duy Nguyen, Duc Minh Nguyen, Huu Manh Nguyen and Thi Quynh Giang Nguyen
Additional contact information
Tuan Duy Nguyen: Department of Mathematical Economics, National Economics University
Duc Minh Nguyen: Department of Mathematical Economics, National Economics University
Huu Manh Nguyen: Deakin University
Thi Quynh Giang Nguyen: Department of Mathematical Economics, National Economics University

Journal of Marketing Analytics, 2025, vol. 13, issue 2, No 6, 385 pages

Abstract: Integrating customer feedback into product refinement is critical for businesses aiming to improve product offerings, enhance customer experiences, and boost revenue. While user-generated reviews on e-commerce platforms provide timely and unbiased feedback, the unstructured nature and high volume of such textual data pose significant challenges for extracting actionable insights. This study investigates the application of natural language processing (NLP) techniques, specifically topic modeling and text classification, to address this issue. In the first phase of the study, we employed Latent Dirichlet Allocation (LDA) and BERTopic to identify latent topics within customer reviews, providing a thematic overview of customer discussions. Because these models performed poorly, manual labeling was introduced in the second phase, based on the topics identified in the first. The final classification model was developed using PhoBERT embeddings combined with Logistic Regression. The experiments were conducted on a dataset of 17,002 Vietnamese customer reviews from Shopee, one of Vietnam’s largest e-commerce platforms. The model successfully categorized reviews into five primary topics: Product Quality, Customer Service, Price, Shipping, and Packaging, achieving an F1-score of 0.96 and a Hamming Loss of 0.022. This result helps e-commerce managers identify customer issues, allowing for effective improvements in the buying experience.
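
The two metrics quoted in the abstract can be illustrated with a minimal sketch. It assumes a multi-label setup (each review may carry several of the five topics), which the use of Hamming Loss suggests but the abstract does not state, and it assumes micro-averaging for the F1-score, which the abstract also leaves unspecified. The label matrices are toy values made up for illustration, not data from the study, and the function names are hypothetical.

```python
# Topic order taken from the abstract; each review is a 0/1 vector over these topics.
TOPICS = ["Product Quality", "Customer Service", "Price", "Shipping", "Packaging"]


def hamming_loss(y_true, y_pred):
    """Fraction of (review, topic) slots where the prediction differs from the label."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for row_true, row_pred in zip(y_true, y_pred)
        for t, p in zip(row_true, row_pred)
    )
    return wrong / total


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool true/false positives and false negatives over all topics."""
    tp = fp = fn = 0
    for row_true, row_pred in zip(y_true, y_pred):
        for t, p in zip(row_true, row_pred):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy example (assumed values): four reviews, five topics.
y_true = [
    [1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 1, 1],
]
y_pred = [
    [1, 0, 0, 1, 0],  # exact match
    [0, 1, 0, 0, 1],  # one spurious topic (Packaging)
    [1, 0, 0, 0, 0],  # one missed topic (Price)
    [0, 0, 0, 1, 1],  # exact match
]

print("Hamming loss:", hamming_loss(y_true, y_pred))
print("Micro F1:", round(micro_f1(y_true, y_pred), 4))
```

Under this reading, a Hamming Loss of 0.022 would mean that about 2.2% of all review-topic slots are predicted incorrectly.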

Keywords: E-commerce reviews analysis; Natural language processing (NLP); Text classification; Topic modeling
Date: 2025

Downloads: (external link)
http://link.springer.com/10.1057/s41270-025-00402-w Abstract (text/html)
Access to the full text of the articles in this series is restricted.

Persistent link: https://EconPapers.repec.org/RePEc:pal:jmarka:v:13:y:2025:i:2:d:10.1057_s41270-025-00402-w

Ordering information: This journal article can be ordered from
http://www.springer. ... gement/journal/41270

DOI: 10.1057/s41270-025-00402-w

Journal of Marketing Analytics is currently edited by Maria Petrescu and Anjala Krishen

More articles in Journal of Marketing Analytics from Palgrave Macmillan
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Page updated 2025-06-21
Handle: RePEc:pal:jmarka:v:13:y:2025:i:2:d:10.1057_s41270-025-00402-w