Customers' Opinion Mining from Extensive Amount of Textual Reviews in Relation to Induced Knowledge Growth
Jan Žižka and
Arnošt Svoboda
Additional contact information
Jan Žižka: Department of Informatics, Faculty of Business and Economics, Mendel University in Brno, Zemědělská 1, 613 00 Brno, Czech Republic
Arnošt Svoboda: Department of Applied Mathematics and Computer Science, Faculty of Economics and Administration, Masaryk University, Žerotínovo nám. 617/9, 601 77 Brno, Czech Republic
Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis, 2015, vol. 63, issue 6, 2229-2237
Abstract:
Customers of various services are often invited to type a summarizing review via an Internet portal. Such reviews, written in natural languages, are typically unstructured, giving also a numeric evaluation within the scale "good" and "bad." The more reviews, the better feedback can be acquired for improving the service. However, after accumulating massive data, the non-linearly growing processing complexity may exceed the computational abilities to analyze the text contents. Decision tree inducers like c5 can reveal understandable knowledge from data but they need the data as a whole. This article describes an application of windowing, which is a technique for generating dataset subsamples that provide enough information for an inducer to train a classifier and get results similar to those achieved by training a model from the entire dataset. The windowing results, significantly reducing the complexity of the learning problem, are demonstrated using hundreds of thousands reviews written in English by hotel-service customers. A user obtains knowledge represented by significant words. The results show classification accuracy errors, training and testing time, tree sizes, and words relevant for the review meaning in dependence on the training subsample size. Finally, a method of suitable training-set size estimation is suggested.
Keywords: text mining; customer opinion analysis; decision trees; decision rules; windowing; large data volumes; machine learning; computational complexity; training-set size (search for similar items in EconPapers)
Date: 2015
References: View complete reference list from CitEc
Citations:
Downloads: (external link)
http://acta.mendelu.cz/doi/10.11118/actaun201563062229.html (text/html)
http://acta.mendelu.cz/doi/10.11118/actaun201563062229.pdf (application/pdf)
free of charge
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:mup:actaun:actaun_2015063062229
DOI: 10.11118/actaun201563062229
Access Statistics for this article
Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis is currently edited by Markéta Havlásková
More articles in Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis from Mendel University Press
Bibliographic data for series maintained by Ivo Andrle ().