
Classification With Unstructured Predictors and an Application to Sentiment Analysis

Junhui Wang, Xiaotong Shen, Yiwen Sun and Annie Qu

Journal of the American Statistical Association, 2016, vol. 111, issue 515, 1242-1253

Abstract: Unstructured data refer to information that lacks certain structures and cannot be organized in a predefined fashion. Unstructured data often involve words, texts, graphs, objects, or multimedia types of files that are difficult to process and analyze with traditional computational tools and statistical methods. This work explores ordinal classification for unstructured predictors with ordered class categories, where imprecise information concerning strengths of association between predictors is available for predicting class labels. However, imprecise information here is expressed in terms of a directed graph, with each node representing a predictor and a directed edge containing pairwise strengths of association between two nodes. One of the targeted applications for unstructured data arises from sentiment analysis, which identifies and extracts the relevant content or opinion of a document concerning a specific event of interest. We integrate the imprecise predictor relations into linear relational constraints over classification function coefficients, where large margin ordinal classifiers are introduced, subject to many quadratically linear constraints. The proposed classifiers are then applied in sentiment analysis using binary word predictors. Computationally, we implement ordinal support vector machines and ψ-learning through a scalable quadratic programming package based on sparse word representations. Theoretically, we show that using relationships among unstructured predictors improves prediction accuracy of classification significantly. We illustrate an application for sentiment analysis using consumer text reviews and movie review data. Supplementary materials for this article are available online.

Date: 2016

Downloads: (external link)
http://hdl.handle.net/10.1080/01621459.2015.1089771 (text/html)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:taf:jnlasa:v:111:y:2016:i:515:p:1242-1253

Ordering information: This journal article can be ordered from
http://www.tandfonline.com/pricing/journal/UASA20

DOI: 10.1080/01621459.2015.1089771

Journal of the American Statistical Association is currently edited by Xuming He, Jun Liu, Joseph Ibrahim and Alyson Wilson

More articles in Journal of the American Statistical Association from Taylor & Francis Journals
Bibliographic data for series maintained by Chris Longhurst.

 
Page updated 2025-03-20
Handle: RePEc:taf:jnlasa:v:111:y:2016:i:515:p:1242-1253