How to conduct high-quality psychology research using web scraping and APIs
Richard N. Landers, Saron Demeke and Vivien Lee
Chapter 16 in How to Conduct and Publish High-Quality Research in Industrial-Organizational Psychology, 2025, pp 244-256 from Edward Elgar Publishing
Abstract: Scraping data from the internet has become a powerful, popular approach to collect a wide variety of behavioral data quickly and easily, especially in psychology. In this chapter, we guide researchers through both the conceptual challenges of conducting observational research using web-based data as well as the technical challenges of engineering software solutions using R to collect web-based data algorithmically. We first discuss the creation and testing of data source theories: formal, structured statements about necessary assumptions to justify use of a particular source of web-based data in relation to internal and external validity threats. Second, we provide four technical scenarios that researchers should match to determine the programming complexity that will be necessary to execute their scraping project. By developing both of these skills, this chapter prepares researchers to collect rich, massive, thoughtfully constructed datasets using social media data and beyond while maximizing the generalizability of conclusions drawn from them.
Keywords: Web scraping; Data source theory; External validity; API; Reddit; SelectorGadget
Date: 2025
ISBN: 9781035307739
Downloads: (external link)
https://www.elgaronline.com/doi/10.4337/9781035307746.00025 (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:elg:eechap:22114_16
Ordering information: This item can be ordered from
http://www.e-elgar.com