Text Mining in different languages

Lebart, Ludovic

Applied Stochastic Models and Data Analysis, 1998, vol. 14, issue 4, 323-334

Abstract: The purpose of Text Mining is to describe and explore textual data, to uncover structural traits, and proceed to predictions. The field of application concerns Information Retrieval, processing responses to open‐ended questions in sample surveys as well as processing textual corpora of a more general nature. At the intersection of Corpora Linguistics and Exploratory Statistical Analysis, a series of language independent tools and methods can perform most of the previously mentioned tasks, including the assessment and validation of the obtained results, be it visualization or categorization. Multiple confusion matrices calculated on test‐samples characterize the quality of the prediction as well as the structure of errors of prediction. In the case of multinational surveys and corpora, they allow us to proceed to comparisons among several countries, in spite of the very heterogeneous character of the basic information (texts in different languages). Copyright © 1998 John Wiley & Sons, Ltd.
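The abstract's point about confusion matrices computed on test samples can be illustrated with a minimal sketch. It is not taken from the article: the category labels, the sample data, and the function names `confusion_matrix` and `accuracy` are all hypothetical, chosen only to show how such a matrix exposes both the overall quality of a prediction and the structure of its errors. In a multinational setting, one such matrix could presumably be computed per country test sample and the matrices compared.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels):
    """Count (true, predicted) pairs.

    Rows correspond to true categories, columns to predicted categories,
    both in sorted order of the category names.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    categories = sorted(set(true_labels) | set(predicted_labels))
    counts = Counter(zip(true_labels, predicted_labels))
    matrix = [[counts[(t, p)] for p in categories] for t in categories]
    return categories, matrix


def accuracy(matrix):
    """Share of test-sample items on the diagonal (correctly predicted)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical test sample: true vs. predicted categories of six texts.
true_labels = ["A", "A", "B", "B", "C", "C"]
predicted_labels = ["A", "B", "B", "B", "C", "A"]

categories, matrix = confusion_matrix(true_labels, predicted_labels)
for name, row in zip(categories, matrix):
    print(name, row)
print("accuracy:", round(accuracy(matrix), 3))
```

The diagonal gives the correctly predicted items, while the off-diagonal cells show which categories are confused with which. That is the "structure of errors" a single accuracy figure hides.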

Date: 1998

Downloads: (external link)
https://doi.org/10.1002/(SICI)1099-0747(199812)14:43.0.CO;2-0

Persistent link: https://EconPapers.repec.org/RePEc:wly:apsmda:v:14:y:1998:i:4:p:323-334
