Data Mining of Text Documents

Triantaphyllou, Evangelos

Evangelos Triantaphyllou: Louisiana State University

Chapter 13 in Data Mining and Knowledge Discovery via Logic-Based Methods, 2010, pp 257-276 from Springer

Abstract: This chapter investigates the problem of classifying (mining) text documents into two disjoint classes. It does so by employing a data mining approach based on the OCAT algorithm. This chapter is based on the work discussed in [Nieto Sanchez, Triantaphyllou, and Kraft, 2002]. In the present setting, two sample sets of training examples (text documents) are assumed to be available. An approach is developed that uses indexing terms (keywords) to form patterns of logical expressions (Boolean functions), which are then used to classify new text documents of unknown class. This is a typical case of supervised “crisp” classification.
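
To illustrate the idea in the abstract, the following is a minimal sketch of classifying a document with a Boolean function over indexing terms. It is not the chapter's OCAT implementation: the term list, the hand-written pattern, and the names `TERMS`, `PATTERN`, `to_binary_vector`, and `classify` are all hypothetical, chosen only for illustration. In the chapter's setting such a pattern would be inferred from the two training sets rather than written by hand.

```python
# Hypothetical indexing terms (keywords); not taken from the chapter.
TERMS = ["merger", "stock", "earnings", "weather"]


def to_binary_vector(text, terms=TERMS):
    """Map a document to a 0/1 vector: 1 if the term occurs in the text."""
    words = set(text.lower().split())
    return [1 if term in words else 0 for term in terms]


# A hand-written Boolean function in conjunctive normal form (assumed,
# not learned). Each clause is a list of (term_index, required_value)
# literals: (i, 1) means "term i present", (i, 0) means "term i absent".
# This pattern encodes: (merger OR stock) AND (NOT weather).
PATTERN = [[(0, 1), (1, 1)], [(3, 0)]]


def classify(vector, cnf=PATTERN):
    """Return True (first class) if every clause has a satisfied literal."""
    return all(any(vector[i] == v for i, v in clause) for clause in cnf)


doc_a = "Stock prices rose after the merger announcement"
doc_b = "The weather report mentions stock shortages"

print(classify(to_binary_vector(doc_a)))  # True
print(classify(to_binary_vector(doc_b)))  # False
```

The classification is "crisp" in the sense that each document is assigned to exactly one of the two classes, with no degree of membership.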

Keywords: Cross Validation; Boolean Function; Wall Street Journal; Text Document; Vector Space Model
Date: 2010

Persistent link: https://EconPapers.repec.org/RePEc:spr:spochp:978-1-4419-1630-3_13

Ordering information: This item can be ordered from
http://www.springer.com/9781441916303

DOI: 10.1007/978-1-4419-1630-3_13

Series: Springer Optimization and Its Applications, Springer
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.