Learning semantic similarity from sentence pairs using hybrid features centric approach and explainable siamese neural networks

Weihong Zhao and Chunlu Hu

PLOS ONE, 2026, vol. 21, issue 5, 1-29

Abstract: Semantic embeddings play an important role in modern natural language processing because they help models capture meaning beyond individual words. Accurate text similarity is essential for many applications, such as search, automated scoring, summarization, and question answering. However, existing methods based on Term Frequency-Inverse Document Frequency (TF-IDF) or simple lexical overlap often fail when sentences differ in length, structure, or word choice. These methods perform poorly, especially on short or medium-length sentences where the same meaning is expressed in different ways. This study explores sentence-level similarity using a Siamese BiLSTM model that learns deep semantic patterns and relationships between two sentences. The model captures contextual meaning, word interactions, and paraphrastic variations more effectively than traditional approaches. Experimental results show that the proposed model achieves the best performance among the machine-learning regressors evaluated, with lower errors and improved stability. Compared to TF-IDF, cosine similarity, and feature-based regressors, the Siamese model provides more accurate judgments of semantic closeness, with an RMSE of 0.16 and an MAE of 0.107. Feature-level analysis using TF-IDF, Jaccard similarity, and embedding distances further supports these findings. Explainable AI techniques (SHAP, LIME) support model transparency by highlighting meaningful semantic cues and showing that attention is distributed across important linguistic features.
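
The lexical baselines and error metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the whitespace tokenizer, and the smoothed IDF formula are assumptions chosen for illustration, and the example sentences are made up.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def jaccard(a, b):
    """Jaccard similarity: shared unique tokens divided by all unique tokens."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa and not sb:
        return 1.0  # convention chosen here: two empty sentences are identical
    return len(sa & sb) / len(sa | sb)


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a small list of sentences."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many sentences each token appears.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF (one common variant, assumed here).
    idf = {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}
    return [
        {tok: tf * idf[tok] for tok, tf in Counter(toks).items()}
        for toks in tokenized
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rmse(y_true, y_pred):
    """Root mean squared error between gold and predicted scores."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def mae(y_true, y_pred):
    """Mean absolute error between gold and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Hypothetical paraphrase pair: similar meaning, little word overlap.
    s1 = "the physician examined the patient"
    s2 = "a doctor checked on the sick man"
    vec1, vec2 = tfidf_vectors([s1, s2])
    print("Jaccard:", jaccard(s1, s2))
    print("TF-IDF cosine:", cosine(vec1, vec2))
```

On a paraphrase pair like the one above, both scores depend only on shared surface tokens, which is the limitation the abstract attributes to lexical methods. The `rmse` and `mae` helpers show how predicted similarity scores would be compared with gold labels; the figures reported in the abstract come from the paper's own experiments, not from this sketch.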

Date: 2026

Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0345540 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 45540&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0345540

DOI: 10.1371/journal.pone.0345540

More articles in PLOS ONE from Public Library of Science
 
Page updated 2026-05-24
Handle: RePEc:plo:pone00:0345540