EconPapers    
Economics at your fingertips  
 

Optimizing document management and retrieval with multimodal transformers and knowledge graphs

Yali Chen, Bin Hu and Yajuan Liu

PLOS ONE, 2025, vol. 20, issue 6, 1-27

Abstract: In the digital age, multimodal archival data is experiencing explosive growth, and how to efficiently and accurately retrieve information from it has become a key challenge. Traditional retrieval methods struggle to effectively handle multi-source heterogeneous multimodal data, leading to poor retrieval accuracy and efficiency. To address this issue, this paper proposes the MDKG-RL model, which organically integrates knowledge graph reasoning, deep reinforcement learning dynamic optimization, and multimodal Transformer architecture to achieve deep semantic understanding of multimodal data and intelligent optimization of retrieval strategies. The experiments, based on the ICDAR 2023 and AIDA Corpus datasets, show that MDKG-RL achieves a mean reciprocal rank (MRR) of 0.85, a normalized discounted cumulative gain (NDCG) of 0.88, and an entity linking accuracy of 92.4%. Compared to the baseline model, MRR improves by 13.3%, NDCG increases by 12.8%, and response time is reduced by 38.2%, significantly outperforming other comparison models. Ablation experiments also confirm the indispensability of each module. Visual analysis further demonstrates the model’s clear advantages in retrieval accuracy and efficiency, though error analysis reveals its shortcomings in handling long-tail entities and cross-modal ambiguity. The MDKG-RL model provides an innovative and effective solution for multimodal archival retrieval, not only improving retrieval performance but also laying the foundation for future research. In the future, model performance and generalization capabilities can be further enhanced by expanding data, optimizing strategies, and extending application scenarios, thereby promoting the development and application of multimodal retrieval technology in the fields of information management and knowledge discovery.

Date: 2025
References: View complete reference list from CitEc
Citations:

Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0323966 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 23966&type=printable (application/pdf)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0323966

DOI: 10.1371/journal.pone.0323966

Access Statistics for this article

More articles in PLOS ONE from Public Library of Science
Bibliographic data for series maintained by plosone ().

 
Page updated 2025-06-21
Handle: RePEc:plo:pone00:0323966