txttool: Utilities for text analysis in Stata
Unislawa Williams and
Sean P. Williams
Additional contact information
Unislawa Williams: Spelman College
Sean P. Williams: SunTrust Bank
Stata Journal, 2014, vol. 14, issue 4, 817-829
Abstract:
This article describes txttool, a command that provides a set of tools for managing free-form text. The command integrates several built-in Stata functions with new text capabilities. These new capabilities include a utility to create a bag-of-words representation of text and an implementation of Porter’s (1980, Program: Electronic library and information systems 14: 130–137) word-stemming algorithm. Collectively, these utilities provide a text-processing suite for text mining and other text-based applications in Stata. Copyright 2014 by StataCorp LP.
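To make the term concrete, a bag-of-words representation turns each text into a row of word counts over a shared vocabulary. The following is a minimal Python sketch of that general idea, not txttool's implementation or syntax. The function names, the tokenization rule, and the short stop-word list are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration only; not the list txttool uses.
STOP_WORDS = {"the", "a", "of", "and", "in", "to"}

def clean(text):
    """Lowercase the text and split it into purely alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def bag_of_words(docs, stop_words=STOP_WORDS):
    """Return (vocab, rows): a sorted vocabulary and one count row per document."""
    counts = [Counter(t for t in clean(d) if t not in stop_words) for d in docs]
    vocab = sorted(set().union(*counts))
    rows = [[c[w] for w in vocab] for c in counts]
    return vocab, rows

docs = ["The cat sat in the hat.", "A cat and a dog."]
vocab, rows = bag_of_words(docs)
print(vocab)
for row in rows:
    print(row)
```

Each column corresponds to one vocabulary word and each row to one document, which is the shape a statistical package needs in order to treat word counts as ordinary variables.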
Keywords: txttool; text mining; Porter stemmer; bag of words; cleaning; stop words; subwords
Date: 2014
Note: to access software from within Stata, net describe http://www.stata-journal.com/software/sj14-4/dm0077/
Downloads:
http://www.stata-journal.com/article.html?article=dm0077 (link to article purchase)
Persistent link: https://EconPapers.repec.org/RePEc:tsj:stataj:v:14:y:2014:i:4:p:817-829
Ordering information: This journal article can be ordered from
http://www.stata-journal.com/subscription.html
Stata Journal is currently edited by Nicholas J. Cox and Stephen P. Jenkins
Bibliographic data for series maintained by Christopher F. Baum and Lisa Gilmore.