Image Clustering: An Unsupervised Approach to Categorize Visual Data in Social Science Research

Han Zhang and Yilang Peng

Sociological Methods & Research, 2024, vol. 53, issue 3, 1534-1587

Abstract: Automated image analysis has received increasing attention in social scientific research, yet existing scholarship has mostly covered the application of supervised learning to classify images into predefined categories. This study focuses on the task of unsupervised image clustering, which aims to automatically discover categories from unlabelled image data. We first review the steps to perform image clustering and then focus on one key challenge in this task—finding intermediate representations of images. We present several methods of extracting intermediate image representations, including the bag-of-visual-words model, self-supervised learning, and transfer learning (in particular, feature extraction with pretrained models). We compare these methods using various visual datasets, including images related to protests in China from Weibo, images about climate change on Instagram, and profile images of the Russian Internet Research Agency on Twitter. In addition, we propose a systematic way to interpret and validate clustering solutions. Results show that transfer learning significantly outperforms the other methods. The dataset used in the pretrained model critically determines what categories the algorithms can discover.

Keywords: computational social science; machine learning; visual data; image as data; computer vision; unsupervised learning; image clustering
Date: 2024

Downloads: https://journals.sagepub.com/doi/10.1177/00491241221082603

Persistent link: https://EconPapers.repec.org/RePEc:sae:somere:v:53:y:2024:i:3:p:1534-1587

DOI: 10.1177/00491241221082603

Bibliographic data for series maintained by SAGE Publications.

 
Page updated 2025-03-19
Handle: RePEc:sae:somere:v:53:y:2024:i:3:p:1534-1587