Text Mining in different languages
Ludovic Lebart
Applied Stochastic Models and Data Analysis, 1998, vol. 14, issue 4, 323-334
Abstract:
The purpose of Text Mining is to describe and explore textual data, to uncover structural traits, and proceed to predictions. The field of application concerns Information Retrieval, processing responses to open‐ended questions in sample surveys as well as processing textual corpora of a more general nature. At the intersection of Corpora Linguistics and Exploratory Statistical Analysis, a series of language independent tools and methods can perform most of the previously mentioned tasks, including the assessment and validation of the obtained results, be it visualization or categorization. Multiple confusion matrices calculated on test‐samples characterize the quality of the prediction as well as the structure of errors of prediction. In the case of multinational surveys and corpora, they allow us to proceed to comparisons among several countries, in spite of the very heterogeneous character of the basic information (texts in different languages). Copyright © 1998 John Wiley & Sons, Ltd.
Date: 1998
Downloads: (external link)
https://doi.org/10.1002/(SICI)1099-0747(199812)14:43.0.CO;2-0
Persistent link: https://EconPapers.repec.org/RePEc:wly:apsmda:v:14:y:1998:i:4:p:323-334