Distilling knowledge from multiple foundation models for zero-shot image classification
Siqi Yin and
Lifan Jiang
PLOS ONE, 2024, vol. 19, issue 9, 1-12
Abstract:
Zero-shot image classification enables the recognition of new categories without additional training data, thereby enhancing a model's generalization capability when specific training data are unavailable. This paper introduces a zero-shot image classification framework that recognizes categories unseen during training by distilling knowledge from foundation models. Specifically, we first employ ChatGPT and DALL-E to synthesize reference images of unseen categories from text prompts. The test image is then aligned with the text and the reference images using CLIP and DINO to compute logits. Finally, the predicted logits are aggregated according to their confidence to produce the final prediction. Experiments on MNIST, SVHN, CIFAR-10, CIFAR-100, and TinyImageNet show that our method significantly improves classification accuracy over previous approaches, achieving AUROC scores above 96% on all test datasets. Our code is available at https://github.com/1134112149/MICW-ZIC.
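The abstract's final step, aggregating the CLIP and DINO logits according to their confidence, can be sketched as below. This is a minimal illustration assuming a common confidence measure (the maximum softmax probability) as the per-model weight; the paper's exact aggregation rule may differ, and the function names are hypothetical.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def aggregate_by_confidence(logits_list):
    """Fuse per-model logits into one prediction, weighting each model
    by its own confidence (here: its maximum softmax probability).

    `logits_list` holds one (num_classes,) array per model, e.g. the
    CLIP text-alignment logits and the DINO reference-image logits.
    The weighting scheme is an illustrative assumption, not the
    paper's verified rule.
    """
    probs = [softmax(l) for l in logits_list]
    weights = np.array([p.max() for p in probs])
    weights = weights / weights.sum()          # normalize to sum to 1
    fused = sum(w * p for w, p in zip(weights, probs))
    return int(fused.argmax()), fused

# Toy example: two models scoring 3 unseen classes.
clip_logits = np.array([2.0, 0.5, 0.1])   # confidently favors class 0
dino_logits = np.array([0.4, 0.6, 0.5])   # nearly uniform
pred, fused = aggregate_by_confidence([clip_logits, dino_logits])
```

Because the near-uniform model receives a smaller weight, the fused distribution follows the confident model, which is the intended effect of confidence-based aggregation.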
Date: 2024
Downloads:
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0310730 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 10730&type=printable (application/pdf)
Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0310730
DOI: 10.1371/journal.pone.0310730