Representing the Twittersphere: Archiving a representative sample of Twitter data under resource constraints

Airo Hino and Robert A. Fahey

International Journal of Information Management, 2019, vol. 48, issue C, 175-184

Abstract: The rising popularity of social media posts, most notably Twitter posts, as a data source for social science research poses significant problems with regard to access to representative, high-quality data for analysis. Cheap, publicly available data such as that obtained from Twitter's public application programming interfaces is often of low quality, while high-quality data is expensive both financially and computationally. Moreover, data is often available only in real time, making post-hoc analysis difficult or impossible. We propose and test a methodology for inexpensively creating an archive of Twitter data through population sampling, yielding a database that is highly representative of the targeted user population (in this test case, the entire population of Japanese-language Twitter users). Comparing the tweet volume, keywords, and topics found in our sample data set with the ground truth of Twitter's full data feed confirmed a very high degree of representativeness in the sample. We conclude that this approach yields a data set that is suitable for a wide range of post-hoc analyses, while remaining cost-effective and accessible to a wide range of researchers.

Keywords: Twitter; Social media; Sampling; Representativeness; Data collection
Date: 2019

Downloads: (external link)
http://www.sciencedirect.com/science/article/pii/S0268401218306005
Full text for ScienceDirect subscribers only

Persistent link: https://EconPapers.repec.org/RePEc:eee:ininma:v:48:y:2019:i:c:p:175-184

DOI: 10.1016/j.ijinfomgt.2019.01.019

International Journal of Information Management is currently edited by Yogesh K. Dwivedi

Bibliographic data for series maintained by Catherine Liu.

 
Handle: RePEc:eee:ininma:v:48:y:2019:i:c:p:175-184