Deep Visual Semantic Embedding with Text Data Augmentation and Word Embedding Initialization
Hai He and
Haibo Yang
Mathematical Problems in Engineering, 2021, vol. 2021, 1-8
Abstract:
Language and vision are the two most essential parts of human intelligence for interpreting the real world around us. How to make connections between language and vision is the key point in current research. Multimodality methods like visual semantic embedding have been widely studied recently, which unify images and corresponding texts into the same feature space. Inspired by the recent development of text data augmentation and a simple but powerful technique proposed called EDA (easy data augmentation), we can expand the information with given data using EDA to improve the performance of models. In this paper, we take advantage of the text data augmentation technique and word embedding initialization for multimodality retrieval. We utilize EDA for text data augmentation, word embedding initialization for text encoder based on recurrent neural networks, and minimizing the gap between the two spaces by triplet ranking loss with hard negative mining. On two Flickr-based datasets, we achieve the same recall with only 60% of the training dataset as the normal training with full available data. Experiment results show the improvement of our proposed model; and, on all datasets in this paper (Flickr8k, Flickr30k, and MS-COCO), our model performs better on image annotation and image retrieval tasks; the experiments also demonstrate that text data augmentation is more suitable for smaller datasets, while word embedding initialization is suitable for larger ones.
Date: 2021
References: Add references at CitEc
Citations:
Downloads: (external link)
http://downloads.hindawi.com/journals/MPE/2021/6654071.pdf (application/pdf)
http://downloads.hindawi.com/journals/MPE/2021/6654071.xml (text/xml)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:hin:jnlmpe:6654071
DOI: 10.1155/2021/6654071
Access Statistics for this article
More articles in Mathematical Problems in Engineering from Hindawi
Bibliographic data for series maintained by Mohamed Abdelhakeem ().