How Small is Big Enough? Open Labeled Datasets and the Development of Deep Learning

Souza, Daniel; Geuna, Aldo; Rodríguez, Jeff

Daniel Souza, Aldo Geuna and Jeff Rodríguez

Carlo Alberto Notebooks from Collegio Carlo Alberto

Abstract: We investigate the emergence of Deep Learning as a technoscientific field, emphasizing the role of open labeled datasets. Through qualitative and quantitative analyses, we evaluate the role of datasets like CIFAR-10 in advancing computer vision and object recognition, which are central to the Deep Learning revolution. Our findings highlight CIFAR-10's crucial role and enduring influence on the field, as well as its importance in teaching ML techniques. Results also indicate that dataset characteristics such as size, number of instances, and number of categories were key factors. Econometric analysis confirms that CIFAR-10, a small-but-sufficiently-large open dataset, played a significant and lasting role in technological advancements and had a major function in the development of the early scientific literature, as shown by citation metrics.

Keywords: Artificial Intelligence; Deep Learning; Emergence of technosciences; Open science; Open Labeled Datasets
Pages: 59 pages
Date: 2025
New Economics Papers: this item is included in nep-big, nep-cmp and nep-his

Downloads: (external link)
https://www.carloalberto.org/wp-content/uploads/2025/05/738.pdf (application/pdf)

Related works:
Journal Article: How small is big enough? Open labeled datasets and the development of deep learning (2025)
Working Paper: How Small is Big Enough? Open Labeled Datasets and the Development of Deep Learning (2024)

Persistent link: https://EconPapers.repec.org/RePEc:cca:wpaper:738
