Attribute Sentiment Scoring With Online Text Reviews: Accounting for Language Structure and Attribute Self-Selection

Chakraborty, Ishita; Kim, Minkyung; Sudhir, K.

Ishita Chakraborty, Minkyung Kim and K. Sudhir
Additional contact information
Ishita Chakraborty: School of Management, Yale University
Minkyung Kim: School of Management, Yale University
K. Sudhir: Cowles Foundation & School of Management, Yale University, https://faculty.som.yale.edu/ksudhir/

No 2176R2, Cowles Foundation Discussion Papers from Cowles Foundation for Research in Economics, Yale University

Abstract: The authors address two significant challenges in using online text reviews to obtain fine-grained, attribute-level sentiment ratings. First, in contrast to methods that rely on word frequency, they develop a deep learning convolutional-LSTM hybrid model to account for language structure. The convolutional layer accounts for spatial structure (adjacent word groups or phrases) and the LSTM accounts for the sequential structure of language (sentiment distributed and modified across non-adjacent phrases). Second, they address the problem of missing attributes in text when constructing attribute sentiment scores, as reviewers write only about a subset of attributes and remain silent on others. They develop a model-based imputation strategy using a structural model of heterogeneous rating behavior. Using Yelp restaurant review data, they show superior attribute sentiment scoring accuracy with their model. They find three reviewer segments with different motivations: status seeking, altruism/want voice, and need to vent/praise. Reviewers write to inform and to vent/praise, but not based on attribute importance. The heterogeneous model-based imputation performs better than other common imputations and, importantly, leads to managerially significant corrections in restaurant attribute ratings. More broadly, the results suggest that social science research should pay more attention to reducing measurement error in variables constructed from text.

Keywords: Text mining; Natural language processing (NLP); Convolutional neural networks (CNN); Long-short term memory (LSTM) Networks; Deep learning; Lexicons; Endogeneity; Self-selection; Online reviews; Online ratings; Customer satisfaction
JEL-codes: C5 C8 M1 M3
Pages: 66 pages
Date: 2019-05, Revised 2021-06
Citations: View citations in EconPapers (3)

Downloads: (external link)
https://cowles.yale.edu/sites/default/files/files/pub/d21/d2176-r2.pdf (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:cwl:cwldpp:2176r2

Ordering information: This working paper can be ordered from
Cowles Foundation, Yale University, Box 208281, New Haven, CT 06520-8281 USA
No price is listed.

Bibliographic data for series maintained by Brittany Ladd.