PTF-SimCM: A Simple Contrastive Model with Polysemous Text Fusion for Visual Similarity Metric

Xinpan Yuan, Xinxin Mao, Wei Xia, Zhiqi Zhang, Shaojun Xie, Chengyuan Zhang and Zhen Zhang

Complexity, 2022, vol. 2022, 1-14

Abstract: Image similarity measurement, also known as metric learning (ML) in computer vision, is a significant step in many advanced image tasks. However, existing well-performing approaches to image similarity measurement focus only on the image itself and ignore information from other modalities, even though images often appear together with descriptive text. Furthermore, those methods require human supervision, yet most real-world images are unlabeled. To address these problems, we present a novel visual similarity metric model named PTF-SimCM. It adopts a self-supervised contrastive structure like SimSiam and incorporates a multimodal fusion module to exploit the textual modality associated with the image. For the text modality, we apply a cross-modal model rather than a standard unimodal text encoder to make late fusion more effective. In addition, the proposed model employs Sentence PIE-Net to handle the ambiguity caused by polysemous sentences. For simplicity and efficiency, our model learns an embedding space in which distances directly correspond to similarity. Experimental results on the MSCOCO, Flickr 30k, and Pascal Sentence datasets show that our model overall outperforms all the compared methods in this work, which indicates that it can effectively address these issues and improve performance on unsupervised visual similarity measurement.
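
The ideas named in the abstract (a SimSiam-style contrastive objective, late fusion of image and text features, and an embedding space where distance stands for similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the concatenate-and-normalize fusion rule, and the plain-list vectors are all assumptions made for illustration, and the encoders, the predictor network, and Sentence PIE-Net are left out entirely.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    return [x / norm for x in v]


def late_fuse(image_vec, text_vec):
    """Hypothetical late fusion: normalize each modality, concatenate, renormalize.

    The abstract does not specify the fusion rule; this is an assumption.
    """
    return l2_normalize(l2_normalize(image_vec) + l2_normalize(text_vec))


def negative_cosine(p, z):
    """Negative cosine similarity between a prediction p and a target z."""
    p, z = l2_normalize(p), l2_normalize(z)
    return -sum(a * b for a, b in zip(p, z))


def simsiam_loss(p1, z2, p2, z1):
    """Symmetric SimSiam-style loss over two augmented views.

    In real training z1 and z2 would be detached (stop-gradient); with plain
    floats there is no gradient, so that step is only noted here.
    """
    return 0.5 * negative_cosine(p1, z2) + 0.5 * negative_cosine(p2, z1)


def embedding_distance(a, b):
    """Euclidean distance between unit-normalized embeddings (smaller = more similar)."""
    return math.dist(l2_normalize(a), l2_normalize(b))


# Usage: fuse toy image and text features for two items, then compare them.
item_a = late_fuse([0.9, 0.1, 0.0], [0.2, 0.8])
item_b = late_fuse([0.8, 0.2, 0.1], [0.1, 0.9])
distance = embedding_distance(item_a, item_b)
```

For unit-length embeddings the squared Euclidean distance equals 2 minus twice the cosine similarity, so ranking by distance and ranking by cosine similarity agree, which is one way to read the claim that distances correspond directly to similarity.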

Date: 2022

Downloads: (external link)
http://downloads.hindawi.com/journals/complexity/2022/2343707.pdf (application/pdf)
http://downloads.hindawi.com/journals/complexity/2022/2343707.xml (application/xml)

Persistent link: https://EconPapers.repec.org/RePEc:hin:complx:2343707

DOI: 10.1155/2022/2343707

More articles in Complexity from Hindawi
Bibliographic data for series maintained by Mohamed Abdelhakeem.

 
Page updated 2025-03-19
Handle: RePEc:hin:complx:2343707