Representing the Twittersphere: Archiving a representative sample of Twitter data under resource constraints
Airo Hino and Robert A. Fahey
International Journal of Information Management, 2019, vol. 48, issue C, 175-184
Abstract:
The rising popularity of social media posts, most notably Twitter posts, as a data source for social science research poses significant problems with regard to access to representative, high-quality data for analysis. Cheap, publicly available data such as that obtained from Twitter's public application programming interfaces is often of low quality, while high-quality data is expensive both financially and computationally. Moreover, data is often available only in real time, making post-hoc analysis difficult or impossible. We propose and test a methodology for inexpensively creating an archive of Twitter data through population sampling, yielding a database that is highly representative of the targeted user population (in this test case, the entire population of Japanese-language Twitter users). Comparing the tweet volume, keywords, and topics found in our sample data set with the ground truth of Twitter's full data feed confirmed a very high degree of representativeness in the sample. We conclude that this approach yields a data set that is suitable for a wide range of post-hoc analyses, while remaining cost-effective and accessible to a wide range of researchers.
Keywords: Twitter; Social media; Sampling; Representativeness; Data collection
Date: 2019
Downloads:
http://www.sciencedirect.com/science/article/pii/S0268401218306005
Full text for ScienceDirect subscribers only
Persistent link: https://EconPapers.repec.org/RePEc:eee:ininma:v:48:y:2019:i:c:p:175-184
DOI: 10.1016/j.ijinfomgt.2019.01.019
International Journal of Information Management is currently edited by Yogesh K. Dwivedi
More articles in International Journal of Information Management from Elsevier
Bibliographic data for series maintained by Catherine Liu.