Understanding Customers' Opinion using Web Scraping and Natural Language Processing
Alin-Gabriel Vaduva,
Simona-Vasilica Oprea and
Dragos-Catalin Barbu
Additional contact information
Alin-Gabriel Vaduva: The Bucharest University of Economic Studies, Department of Economic Informatics and Cybernetics, Romania
Simona-Vasilica Oprea: The Bucharest University of Economic Studies, Department of Economic Informatics and Cybernetics, Romania
Dragos-Catalin Barbu: The Bucharest University of Economic Studies, Department of Economic Informatics and Cybernetics, Romania
Ovidius University Annals, Economic Sciences Series, 2023, vol. XXIII, issue 1, 537-544
Abstract:
The web offers large volumes of unstructured data that cannot be processed further unless they are extracted and organized into local variables or databases. In this paper, we aim to extract data from the Internet using web scraping and analyse it with Natural Language Processing (NLP). Our purpose is to understand customers’ opinions by extracting reviews and investigating them in Python. The positive or negative sentiment of the reviews, along with a word cloud, offers additional tools to understand the customers, predict their behaviour and pinpoint problems signalled in the reviews. TextBlob and BERTweet are applied to analyse the reviews. To aid interpretation of the outcomes, the classifications generated by the BERTweet model are compared with those provided by the TextBlob API, a widely used Python library for performing various NLP tasks. Furthermore, the reviews are pre-processed to remove line breaks, punctuation characters, etc., and an n-gram analysis is performed to better understand the positive and negative reviews. The frequency of the reviews reveals the concrete problems faced by customers visiting the hotel in various seasons, which helps decision makers take measures and improve the quality of the hotel services.
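The pre-processing and n-gram steps mentioned in the abstract could look roughly like the sketch below. It is only an illustration under stated assumptions, not the authors' code: the function names (clean_review, ngrams, top_ngrams) and the two sample reviews are hypothetical, lowercasing is an added assumption, and only the Python standard library is used rather than TextBlob or BERTweet.

```python
import re
import string
from collections import Counter

# Translation table mapping every ASCII punctuation character to a space.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_review(text):
    """Lowercase a review and strip line breaks and punctuation (assumed cleaning rules)."""
    text = text.lower().replace("\r", " ").replace("\n", " ")
    text = text.translate(_PUNCT_TO_SPACE)
    # Collapse the runs of whitespace left behind by the replacements above.
    return re.sub(r"\s+", " ", text).strip()


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(reviews, n=2, k=3):
    """Count n-grams over all cleaned reviews and return the k most frequent."""
    counts = Counter()
    for review in reviews:
        counts.update(ngrams(clean_review(review).split(), n))
    return counts.most_common(k)


# Hypothetical sample data, invented for illustration only.
sample_reviews = [
    "The room was clean.\nStaff was friendly!",
    "Room was clean, but the breakfast was cold...",
]

if __name__ == "__main__":
    print(top_ngrams(sample_reviews, n=2, k=3))
```

Running the same counting separately on reviews labelled positive and negative would give the kind of per-class n-gram comparison the abstract describes; how the authors split and label the reviews is not specified here.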
Keywords: web scraping; booking; customers opinions; natural language processing
JEL-codes: C55 C81 Z13
Date: 2023
Downloads:
https://stec.univ-ovidius.ro/html/anale/RO/2023-i1/Section%203/38.pdf (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:ovi:oviste:v:xxiii:y:2023:i:1:p:537-544
Ovidius University Annals, Economic Sciences Series is currently edited by Spatariu Cerasela
More articles in Ovidius University Annals, Economic Sciences Series from Ovidius University of Constantza, Faculty of Economic Sciences Contact information at EDIRC.
Bibliographic data for series maintained by Gheorghiu Gabriela.