Research on Feature Fusion and Multimodal Patent Text Based on Graph Attention Network

Zhenzhen Song, Ziwei Liu and Hongji Li

Journal of Computer, Signal, and System Research, 2026, vol. 3, issue 1, 93-100

Abstract: To address three challenges in patent text semantic mining (cross-modal feature fusion, low computational efficiency in long patent text modeling, and insufficient hierarchical semantic coherence), this study proposes a deep learning framework termed HGM-Net. The framework integrates Hierarchical Comparative Learning (HCL), a Multi-modal Graph Attention Network (M-GAT), and Multi-Granularity Sparse Attention (MSA) to achieve robust, efficient, and semantically consistent patent representation learning. Specifically, HCL introduces dynamic masking, contrastive learning, and cross-structural similarity constraints across word-, sentence-, and paragraph-level hierarchies, enabling the model to jointly capture fine-grained local semantics and high-level thematic consistency. The contrastive and cross-structural similarity constraints are enforced in particular at the word and paragraph levels, which enhances semantic discrimination and global coherence within complex patent documents. M-GAT models patent classification codes, citation relationships, and textual semantics as heterogeneous graph structures, and employs cross-modal gated attention mechanisms to dynamically fuse multi-source and multi-modal features, thereby improving representation completeness and robustness. To address the high computational cost of long-text processing, MSA adopts a hierarchical sparse attention strategy that selectively allocates attention across multiple granularities (words, phrases, sentences, and paragraphs), significantly reducing computational overhead while preserving critical semantic information. Experimental evaluations on patent classification and similarity matching tasks show that HGM-Net consistently outperforms existing state-of-the-art deep learning approaches. The results validate the effectiveness and generalization capability of the proposed framework and highlight its practical value for improving patent examination efficiency and enabling large-scale technology relevance mining.
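
The abstract mentions gated fusion of multi-modal features but gives no formula. As a rough illustration only, the sketch below shows one common form of element-wise gated fusion of a text vector and a graph vector. The function name `gated_fusion`, the gate equation, and all parameter shapes are assumptions made for this example; they are not taken from the paper and may differ from what M-GAT actually does.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(h_text, h_graph, weights, bias):
    """Fuse two equal-length feature vectors with an element-wise gate.

    Hypothetical formulation (an assumption, not the paper's equations):
        gate_i  = sigmoid(weights[i] . [h_text; h_graph] + bias[i])
        fused_i = gate_i * h_text[i] + (1 - gate_i) * h_graph[i]
    """
    if len(h_text) != len(h_graph):
        raise ValueError("h_text and h_graph must have the same length")
    concat = list(h_text) + list(h_graph)
    fused = []
    for i, (t, g) in enumerate(zip(h_text, h_graph)):
        # Each gate sees both modalities through its own weight row.
        z = sum(w * x for w, x in zip(weights[i], concat)) + bias[i]
        gate = sigmoid(z)
        fused.append(gate * t + (1.0 - gate) * g)
    return fused


# With all-zero weights and bias every gate is 0.5, so the result is
# the plain average of the two modality vectors.
h_text = [1.0, 2.0]
h_graph = [3.0, 6.0]
zero_weights = [[0.0] * 4 for _ in range(2)]
zero_bias = [0.0, 0.0]
print(gated_fusion(h_text, h_graph, zero_weights, zero_bias))  # [2.0, 4.0]
```

In a trained model the weights and bias would be learned, so the gate could lean toward the textual or the graph-derived features on a per-dimension basis.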

Keywords: hierarchical comparative learning; multimodal graph attention networks; multi-granularity sparse attention; patent semantic mining
Date: 2026

Downloads: (external link)
https://www.gbspress.com/index.php/JCSSR/article/view/582/596 (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:dbb:jcssra:v:3:y:2026:i:1:p:93-100

More articles in Journal of Computer, Signal, and System Research from George Brown Press
Bibliographic data for series maintained by Guangyi Li.

 
Page updated 2026-02-11
Handle: RePEc:dbb:jcssra:v:3:y:2026:i:1:p:93-100