Data Mining of Text Documents
Evangelos Triantaphyllou
Additional contact information
Evangelos Triantaphyllou: Louisiana State University
Chapter 13 in Data Mining and Knowledge Discovery via Logic-Based Methods, 2010, pp 257-276 from Springer
Abstract:
This chapter investigates the problem of classifying text documents into two disjoint classes. It does so by employing a data mining approach based on the OCAT algorithm. This chapter is based on the work discussed in [Nieto Sanchez, Triantaphyllou, and Kraft, 2002]. In the present setting, two sample sets of training examples (text documents) are assumed to be available. An approach is developed that uses indexing terms to form patterns of logical expressions (Boolean functions), which are then used to classify new text documents of unknown class. This is a typical case of supervised “crisp” classification.
Keywords: Cross Validation; Boolean Function; Wall Street Journal; Text Document; Vector Space Model
Date: 2010
Persistent link: https://EconPapers.repec.org/RePEc:spr:spochp:978-1-4419-1630-3_13
Ordering information: This item can be ordered from
http://www.springer.com/9781441916303
DOI: 10.1007/978-1-4419-1630-3_13
More chapters in Springer Optimization and Its Applications from Springer