Assessing the Performance of Compression Based Clustering for Text Mining

Alexandra Cernian, Dorin Carstoiu, Adriana Olteanu and Valentin Sgarciu
Additional contact information
Alexandra Cernian: “Politehnica” University of Bucharest
Dorin Carstoiu: “Politehnica” University of Bucharest
Adriana Olteanu: “Politehnica” University of Bucharest
Valentin Sgarciu: “Politehnica” University of Bucharest

ECONOMIC COMPUTATION AND ECONOMIC CYBERNETICS STUDIES AND RESEARCH, 2016, vol. 50, issue 2, 197-210

Abstract: The human brain naturally looks for patterns in whatever surrounds us, so each of us develops models of our personal universe. In an extended form, modeling the universe has been a constant preoccupation of philosophers. Clustering is one of the most useful tools in the data mining process for discovering groups and identifying patterns in the underlying data. This paper addresses the compression-based clustering approach and focuses on validating this method in the context of text mining. The idea is supported by evidence that compression algorithms provide a good measure of informational content. In this context, we developed an integrated clustering platform, called EasyClustering, which incorporates 3 compressors, 4 distance metrics and 3 clustering algorithms. The experimental validation presented in this paper focuses on clustering text documents based on their informational content.
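The idea of using a compressor to compare documents can be illustrated with a minimal sketch. The abstract does not name the compressors or distance metrics that EasyClustering uses, so everything below is an assumption chosen for illustration: `zlib` stands in for the compressor, the Normalized Compression Distance (NCD) stands in for the distance metric, and the function names and sample texts are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized Compression Distance between two texts.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C(.) is the compressed size. Smaller values suggest that
    the two texts share more informational content.
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical sample documents, not taken from the paper.
finance = (
    "The central bank raised interest rates to curb inflation, and "
    "bond yields climbed as investors repriced the outlook for growth."
)
biology = (
    "Photosynthesis converts sunlight, water and carbon dioxide into "
    "glucose and oxygen inside the chloroplasts of green plant cells."
)

if __name__ == "__main__":
    print("finance vs finance:", round(ncd(finance, finance), 3))
    print("finance vs biology:", round(ncd(finance, biology), 3))
```

A pairwise matrix of such distances could then be passed to any standard clustering algorithm; which algorithms the paper actually evaluates is not stated in the abstract.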

Keywords: clustering; compression; text mining; EasyClustering; FScore
JEL-codes: O30
Date: 2016

Downloads: (external link)
ftp://www.eadr.ro/RePEc/cys/ecocyb_pdf/ecocyb2_2016p197-210.pdf

Persistent link: https://EconPapers.repec.org/RePEc:cys:ecocyb:v:50:y:2016:i:2:p:197-210

ECONOMIC COMPUTATION AND ECONOMIC CYBERNETICS STUDIES AND RESEARCH is currently edited by Gheorghe RUXANDA

More articles in ECONOMIC COMPUTATION AND ECONOMIC CYBERNETICS STUDIES AND RESEARCH from Faculty of Economic Cybernetics, Statistics and Informatics.
Bibliographic data for series maintained by Corina Saman.

 
Page updated 2025-03-19
Handle: RePEc:cys:ecocyb:v:50:y:2016:i:2:p:197-210