Mining Twitter data for fun and profit
Joseph Canner (jcanner1@jhmi.edu) and
Neeraja Nagarajan
Additional contact information
Joseph Canner: Johns Hopkins University School of Medicine
Neeraja Nagarajan: Johns Hopkins University School of Medicine
2016 Stata Conference from Stata Users Group
Abstract:
Twitter feed data have become an increasingly rich source of information, both for commercial marketing and for social science research. In the early days of Twitter, researchers could access the Twitter API with a simple URL. To control the size of data requests and to track who is accessing what data, Twitter has since instituted security policies that make API access much more difficult for the average researcher. Users must obtain unique keys from Twitter, create a timestamp, generate a random string, combine these with the actual data request, hash the resulting string using HMAC-SHA1, and submit all of this to the Twitter API in a timely fashion. Large requests must be divided into multiple smaller requests and spread out over a period of time. Our particular need was to obtain Twitter user profile information for over 800 users who had previously tweeted about surgery. We developed a Stata command to automate this process. It includes a number of bitwise operator functions and other tools that could be useful in other applications. We will discuss the unique features of this command, the tools required to implement it, and the feasibility of extending this example to other data request types.
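The signing steps listed in the abstract (keys, timestamp, random string, HMAC-SHA1 hash) follow the general OAuth 1.0 pattern, and a minimal Python sketch may help make them concrete. This is not the authors' Stata command, which is not reproduced here. The function names, the placeholder keys, the endpoint URL, and the batch size of 100 are all assumptions made for illustration; the sketch only builds signed parameters and splits a user list into batches, and it sends nothing over the network.

```python
import base64
import hashlib
import hmac
import secrets
import time
from urllib.parse import quote


def percent_encode(value):
    # RFC 3986 encoding: only unreserved characters are left as-is.
    return quote(value, safe="-._~")


def build_oauth_params(consumer_key, token, query):
    # Merge the data request with the OAuth fields: key, nonce, timestamp.
    params = dict(query)
    params.update({
        "oauth_consumer_key": consumer_key,
        "oauth_token": token,
        "oauth_nonce": secrets.token_hex(16),       # random string
        "oauth_timestamp": str(int(time.time())),   # seconds since epoch
        "oauth_signature_method": "HMAC-SHA1",
        "oauth_version": "1.0",
    })
    return params


def sign_request(method, url, params, consumer_secret, token_secret=""):
    # Sort the encoded parameters and join them into one normalized string.
    pairs = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    param_string = "&".join(f"{k}={v}" for k, v in pairs)
    # Signature base string: METHOD & encoded URL & encoded parameter string.
    base_string = "&".join(
        [method.upper(), percent_encode(url), percent_encode(param_string)]
    )
    # Signing key: encoded consumer secret and token secret joined by "&".
    key = f"{percent_encode(consumer_secret)}&{percent_encode(token_secret)}"
    digest = hmac.new(key.encode(), base_string.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


def chunked(items, size):
    # Split a large request into smaller batches of at most `size` items.
    return [items[i:i + size] for i in range(0, len(items), size)]


if __name__ == "__main__":
    # Hypothetical values throughout; real keys come from the API provider.
    url = "https://api.twitter.com/1.1/users/lookup.json"  # assumed endpoint
    users = [f"user{i}" for i in range(800)]
    for batch in chunked(users, 100):  # assumed batch size
        params = build_oauth_params("CONSUMER_KEY", "TOKEN",
                                    {"screen_name": ",".join(batch)})
        params["oauth_signature"] = sign_request(
            "GET", url, params, "CONSUMER_SECRET", "TOKEN_SECRET")
        # A real client would send the request here and pause between batches.
```

Because the timestamp and nonce are generated fresh for each batch, every request gets its own signature, which is one reason the signed request has to be submitted promptly after it is built.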
Date: 2016-08-10
New Economics Papers: this item is included in nep-pay
Downloads: (external link)
http://fmwww.bc.edu/repec/chic2016/chicago16_canner.pptx
Persistent link: https://EconPapers.repec.org/RePEc:boc:scon16:26
Bibliographic data for series maintained by Christopher F Baum (baum@bc.edu).