Using psycholinguistic features for profiling first language of authors
Rosemary Torney, Peter Vamplew and John Yearwood
Journal of the American Society for Information Science and Technology, 2012, vol. 63, issue 6, 1256-1269
Abstract:
This study empirically evaluates the effectiveness of different feature types for the classification of the first language of an author. In particular, it examines the utility of psycholinguistic features, extracted by the Linguistic Inquiry and Word Count (LIWC) tool, that have not previously been applied to the task of author profiling. As LIWC is a tool that has been developed in the psycholinguistic field rather than the computational linguistics field, it was hypothesized that it would be effective, both as a single-type feature set because of its psycholinguistic basis, and in combination with other feature sets, because it should be sufficiently different to add insight rather than redundancy. It was found that LIWC features were competitive with previously used feature types in identifying the first language of an author, and that combined feature sets including LIWC features consistently showed better accuracy rates and average F measures than were achieved by the same feature sets without the LIWC features. As a secondary issue, this study also examined how effectively first language classification scaled up to a larger number of possible languages. It was found that the classification scheme scaled up effectively to the entire 16-language collection from the International Corpus of Learner English, when compared with results achieved on just 5 languages in previous research.
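The comparison described in the abstract (a feature set evaluated with and without LIWC features, scored by accuracy and average F measure) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names are hypothetical, the concatenation of per-document feature vectors is one assumed way of combining feature sets, and "average F measure" is assumed here to mean the unweighted (macro) average of per-language F1 scores, which the abstract does not specify.

```python
from collections import Counter


def combine_features(*feature_sets):
    """Concatenate per-document feature vectors from several feature sets.

    Each feature set is a list with one vector per document, in the same
    document order (an assumption of this sketch).
    """
    return [sum((list(vec) for vec in vecs), []) for vecs in zip(*feature_sets)]


def accuracy(gold, pred):
    """Fraction of documents whose predicted first language is correct."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-language F1 scores (assumed averaging scheme)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted language p, but it was wrong
            fn[g] += 1  # true language g was missed
    scores = []
    for label in sorted(set(gold)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, made-up values purely for illustration (not data from the study).
baseline = [[1, 2], [3, 4]]   # hypothetical non-LIWC features for two documents
liwc = [[5], [6]]             # hypothetical LIWC features for the same documents
combined = combine_features(baseline, liwc)

gold = ["fr", "fr", "de", "de"]
pred = ["fr", "de", "de", "de"]
acc = accuracy(gold, pred)
f1 = macro_f1(gold, pred)
```

In an experiment of the kind the abstract describes, the same classifier would be trained once on the baseline vectors and once on the combined vectors, and the two pairs of scores compared.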
Date: 2012
Downloads: (external link)
https://doi.org/10.1002/asi.22627
Persistent link: https://EconPapers.repec.org/RePEc:bla:jamist:v:63:y:2012:i:6:p:1256-1269
Ordering information: This journal article can be ordered from
https://doi.org/10.1002/(ISSN)1532-2890
More articles in Journal of the American Society for Information Science and Technology from Association for Information Science & Technology
Bibliographic data for series maintained by Wiley Content Delivery.