Curating Training Data for Reliable Large-Scale Visual Data Analysis: Lessons from Identifying Trash in Street View Imagery

Jackelyn Hwang, Nima Dahir, Mayuka Sarukkai and Gabby Wright

Sociological Methods & Research, 2023, vol. 52, issue 3, 1155-1200

Abstract: Visual data have dramatically increased in quantity in the digital age, presenting new opportunities for social science research. However, the extensive time and labor costs of processing and analyzing these data with existing approaches limit their use. Computer vision methods hold promise but often require large training datasets that do not yet exist for identifying sociologically relevant variables. We present a cost-efficient method for curating training data that uses simple tasks and pairwise comparisons to interpret and analyze visual data at scale with computer vision. We apply our approach to the detection of trash levels across space and over time in millions of street-level images in three physically distinct US cities. By comparing against ratings produced in a controlled setting and using computational methods, we demonstrate generally high reliability in the method and identify the sources that limit it. Altogether, this approach expands how visual data can be used at a large scale in sociology.
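The article itself does not include code, but as a rough illustration of how "which image shows more trash?" pairwise judgments can be aggregated into continuous labels for training a computer vision model, the sketch below fits a simple Bradley-Terry model with NumPy. The aggregation model, function name, and data layout are assumptions for illustration only, not the authors' actual pipeline.

```python
import numpy as np

def bradley_terry_scores(n_items, comparisons, n_iters=200, tol=1e-8):
    """Estimate latent 'trash level' scores from pairwise comparisons.

    comparisons: list of (winner, loser) image-index pairs, where the
    winner is the image judged to show more trash.
    Returns one score per image (higher = more trash), normalized to sum to 1.
    """
    wins = np.zeros(n_items)                     # times each image was chosen
    pair_counts = np.zeros((n_items, n_items))   # comparisons per image pair
    for w, l in comparisons:
        wins[w] += 1
        pair_counts[w, l] += 1
        pair_counts[l, w] += 1

    scores = np.ones(n_items) / n_items
    for _ in range(n_iters):
        # Standard minorization-maximization update for Bradley-Terry:
        # p_i <- W_i / sum_j n_ij / (p_i + p_j)
        denom = (pair_counts / (scores[:, None] + scores[None, :])).sum(axis=1)
        new_scores = wins / np.maximum(denom, 1e-12)
        new_scores /= new_scores.sum()
        if np.abs(new_scores - scores).max() < tol:
            scores = new_scores
            break
        scores = new_scores
    return scores

# Hypothetical usage: 4 images, crowd workers judged which of each pair had more trash.
judgments = [(0, 1), (0, 2), (1, 2), (3, 0), (3, 1), (3, 2), (0, 1)]
print(bradley_terry_scores(4, judgments))
```

The resulting continuous scores could then serve as regression targets (or be thresholded into ordinal classes) for a supervised image model; that downstream step is likewise an assumption about how such labels would typically be used.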

Keywords: computer vision; crowdsourcing; visual data; urban sociology; systematic social observation
Date: 2023
Downloads: https://journals.sagepub.com/doi/10.1177/00491241231171945 (text/html)



Persistent link: https://EconPapers.repec.org/RePEc:sae:somere:v:52:y:2023:i:3:p:1155-1200

DOI: 10.1177/00491241231171945



Handle: RePEc:sae:somere:v:52:y:2023:i:3:p:1155-1200