Assessing reliability of social media data: lessons from mining TripAdvisor hotel reviews
Zheng Xiang,
Qianzhou Du,
Yufeng Ma and
Weiguo Fan
Additional contact information
Zheng Xiang: Virginia Tech
Qianzhou Du: Virginia Tech
Yufeng Ma: Virginia Tech
Weiguo Fan: Virginia Tech
Information Technology & Tourism, 2018, vol. 18, issue 1, No 4, 43-59
Abstract:
As an emerging research paradigm, big data analytics has been gaining currency in various fields. However, the existing hospitality and tourism literature rarely discusses data quality, which may affect the validity and generalizability of research findings. This study examines the reliability of online hotel reviews on TripAdvisor by developing a text classifier that predicts travel purpose (i.e., business vs. leisure) from the text of each review. The classifier is tested over a range of cities and data sizes to examine its sensitivity to data samples. The findings show that, while the classifier’s performance is consistent across different cities, it varies with data size and sampling method. More importantly, a considerable amount of noise is found in the data, which leads to misclassification. Furthermore, a novel approach is developed to address the misclassification problem resulting from data noise. This study reveals important data quality issues and contributes to the theoretical development of social media analytics in hospitality and tourism.
Keywords: Big data; Data quality; Online hotel reviews; Social media analytics; Text classification; Methodology
Date: 2018
Downloads: (external link)
http://link.springer.com/10.1007/s40558-017-0098-z Abstract (text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:infott:v:18:y:2018:i:1:d:10.1007_s40558-017-0098-z
Ordering information: This journal article can be ordered from
http://www.springer. ... ystems/journal/40558
DOI: 10.1007/s40558-017-0098-z
Information Technology & Tourism is currently edited by Zheng Xiang
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.