Do you see what I see? Measuring the semantic differences in image‐recognition services' outputs
Anton Berg and Matti Nelimarkka
Journal of the Association for Information Science & Technology, 2023, vol. 74, issue 11, 1307-1324
Abstract:
As scholars increasingly undertake large‐scale analysis of visual materials, advanced computational tools show promise for informing that process. One technique in the toolbox is image recognition, made readily accessible via Google Vision AI, Microsoft Azure Computer Vision, and Amazon's Rekognition service. However, concerns about such issues as bias factors and low reliability have led to warnings against research employing it. A systematic study of cross‐service label agreement concretized such issues: using eight datasets, spanning professionally produced and user‐generated images, the work showed that image‐recognition services disagree on the most suitable labels for images. Beyond supporting caveats expressed in prior literature, the report articulates two mitigation strategies, both involving the use of multiple image‐recognition services: Highly explorative research could include all the labels, accepting noisier but less restrictive analysis output. Alternatively, scholars may employ word‐embedding‐based approaches to identify concepts that are similar enough for their purposes, then focus on those labels filtered in.
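The two mitigation strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the label lists, the three-dimensional toy vectors, the function names, and the 0.9 threshold are all hypothetical, chosen only to make the example self-contained. In practice the labels would come from the image-recognition services and the vectors from a pretrained word-embedding model.

```python
from math import sqrt


def jaccard(labels_a, labels_b):
    """Share of labels that two services have in common (exact string match)."""
    a, b = set(labels_a), set(labels_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def all_labels(labels_a, labels_b):
    """Strategy 1 (explorative): keep every label from every service."""
    return sorted(set(labels_a) | set(labels_b))


def similar_pairs(labels_a, labels_b, embeddings, threshold):
    """Strategy 2: keep cross-service label pairs whose embeddings are close."""
    pairs = []
    for a in labels_a:
        for b in labels_b:
            if a in embeddings and b in embeddings:
                if cosine(embeddings[a], embeddings[b]) >= threshold:
                    pairs.append((a, b))
    return pairs


# Hypothetical labels returned by two services for the same image.
service_a = ["dog", "grass", "pet"]
service_b = ["puppy", "grass", "lawn"]

# Hand-made toy vectors standing in for real word embeddings (assumption).
toy_embeddings = {
    "dog": [1.0, 0.1, 0.0],
    "puppy": [0.9, 0.2, 0.0],
    "pet": [0.8, 0.3, 0.1],
    "grass": [0.0, 1.0, 0.1],
    "lawn": [0.1, 0.9, 0.2],
}

agreement = jaccard(service_a, service_b)
union = all_labels(service_a, service_b)
matched = similar_pairs(service_a, service_b, toy_embeddings, threshold=0.9)

print(agreement)
print(union)
print(matched)
```

In this toy case exact-match agreement is low because only "grass" is shared, while the embedding-based filter also pairs near-synonyms such as "dog" and "puppy". The threshold governs how strict "similar enough" is and would need to be tuned for the research question at hand.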
Date: 2023
Downloads: https://doi.org/10.1002/asi.24827
Persistent link: https://EconPapers.repec.org/RePEc:bla:jinfst:v:74:y:2023:i:11:p:1307-1324
Ordering information: This journal article can be ordered from
http://www.blackwell ... bs.asp?ref=2330-1635
More articles in Journal of the Association for Information Science & Technology from Association for Information Science & Technology
Bibliographic data for series maintained by Wiley Content Delivery.