txttool: Utilities for text analysis in Stata

Unislawa Williams and Sean P. Williams
Additional contact information
Unislawa Williams: Spelman College
Sean P. Williams: SunTrust Bank

Stata Journal, 2014, vol. 14, issue 4, 817-829

Abstract: This article describes txttool, a command that provides a set of tools for managing free-form text. The command integrates several built-in Stata functions with new text capabilities. The new capabilities include a utility to create a bag-of-words representation of text and an implementation of Porter’s (1980, Program: Electronic library and information systems 14: 130–137) word-stemming algorithm. Collectively, these utilities provide a text-processing suite for text mining and other text-based applications in Stata. Copyright 2014 by StataCorp LP.
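
For readers unfamiliar with the term, a bag-of-words representation reduces each text to counts of the words it contains, ignoring word order. The following is a minimal Python sketch of that general idea, not txttool's implementation or syntax; the function name bag_of_words, the stop-word list, and the sample documents are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "of", "and", "in"}


def bag_of_words(text, stop_words=STOP_WORDS):
    """Return a Counter mapping each non-stop word to its frequency."""
    # Lowercase the text and keep runs of letters and digits as tokens.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in stop_words)


# Two made-up documents.
docs = ["The cat and the hat", "A hat in the house of cats"]
bags = [bag_of_words(d) for d in docs]

# Shared vocabulary (sorted) and one row of counts per document.
vocabulary = sorted(set().union(*bags))
matrix = [[bag[w] for w in vocabulary] for bag in bags]

print(vocabulary)
print(matrix)
```

In this sketch "cat" and "cats" remain separate columns; a stemmer such as Porter's would typically be applied before counting so that both map to the same stem.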

Keywords: txttool; text mining; Porter stemmer; bag of words; cleaning; stop words; subwords
Date: 2014
Note: To access the software from within Stata, type net describe http://www.stata-journal.com/software/sj14-4/dm0077/

Downloads:
http://www.stata-journal.com/article.html?article=dm0077 (link to article purchase)

Persistent link: https://EconPapers.repec.org/RePEc:tsj:stataj:v:14:y:2014:i:4:p:817-829

Ordering information: This journal article can be ordered from
http://www.stata-journal.com/subscription.html

Stata Journal is currently edited by Nicholas J. Cox and Stephen P. Jenkins

Bibliographic data for series maintained by Christopher F. Baum and Lisa Gilmore.

 
Page updated 2025-03-20
Handle: RePEc:tsj:stataj:v:14:y:2014:i:4:p:817-829