TTG-Text: A Graph-Based Text Representation Framework Enhanced by Typical Testors for Improved Classification

Carlos Sánchez-Antonio, José E. Valdez-Rodríguez and Hiram Calvo
Additional contact information
Carlos Sánchez-Antonio: Cognitive Sciences Laboratory, Center for Computing Research, Instituto Politécnico Nacional, Mexico City 07738, Mexico
José E. Valdez-Rodríguez: Cognitive Sciences Laboratory, Center for Computing Research, Instituto Politécnico Nacional, Mexico City 07738, Mexico
Hiram Calvo: Cognitive Sciences Laboratory, Center for Computing Research, Instituto Politécnico Nacional, Mexico City 07738, Mexico

Mathematics, 2024, vol. 12, issue 22, 1-19

Abstract: Recent advancements in graph-based text representation, particularly with embedding models and transformers such as BERT, have shown significant potential for enhancing natural language processing (NLP) tasks. However, challenges related to data sparsity and limited interpretability remain, especially when working with small or imbalanced datasets. This paper introduces TTG-Text, a novel framework that strengthens graph-based text representation by integrating typical testors, a symbolic feature-selection technique that refines feature importance while reducing dimensionality. Unlike traditional TF-IDF weighting, TTG-Text leverages typical testors to enhance feature relevance within text graphs, improving model interpretability and performance, particularly on smaller datasets. Our evaluation on a text classification task using a graph convolutional network (GCN) shows that TTG-Text achieves 95% accuracy, surpassing conventional methods and BERT while requiring fewer training epochs. By combining symbolic algorithms with graph-based models, this hybrid approach offers a more interpretable, efficient, and high-performing solution for complex NLP tasks.
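The typical-testor idea the abstract builds on can be illustrated with a short sketch (not the authors' implementation, and not their graph-construction or GCN pipeline): given a binary difference matrix whose rows compare pairs of documents from different classes and whose columns are candidate features, a testor is a column subset that covers every row, and a typical testor is a testor that is minimal by inclusion. The toy matrix and feature indexing below are hypothetical.

    from itertools import combinations

    def is_testor(matrix, cols):
        # A column subset is a testor if every row of the difference matrix
        # has a 1 in at least one selected column, i.e., the chosen features
        # distinguish every pair of objects drawn from different classes.
        return all(any(row[c] for c in cols) for row in matrix)

    def typical_testors(matrix):
        # Typical testors are testors that are minimal by inclusion; enumerating
        # subsets by increasing size lets us discard supersets of known testors.
        n_cols = len(matrix[0])
        found = []
        for k in range(1, n_cols + 1):
            for cols in combinations(range(n_cols), k):
                if is_testor(matrix, cols) and not any(set(t) <= set(cols) for t in found):
                    found.append(cols)
        return found

    # Hypothetical difference matrix: 3 document pairs x 4 candidate terms.
    diff_matrix = [
        [1, 0, 1, 0],
        [0, 1, 1, 0],
        [1, 1, 0, 1],
    ]
    print(typical_testors(diff_matrix))  # -> [(0, 1), (0, 2), (1, 2), (2, 3)]

Per the abstract, TTG-Text would use such selected feature subsets, rather than plain TF-IDF weights alone, when assigning feature relevance in the text graph fed to the GCN; that step is not sketched here.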

Keywords: graph-based text representation; typical testors; text classification; TF-IDF; graph convolutional networks (GCNs); natural language processing (NLP); symbolic feature selection; BERT; hybrid models; dimensionality reduction (search for similar items in EconPapers)
JEL-codes: C (search for similar items in EconPapers)
Date: 2024
References: View complete reference list from CitEc

Downloads: (external link)
https://www.mdpi.com/2227-7390/12/22/3576/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/22/3576/ (text/html)

Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.

Export reference: BibTeX RIS (EndNote, ProCite, RefMan) HTML/Text

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:22:p:3576-:d:1521978

Access Statistics for this article

Mathematics is currently edited by Ms. Emma He

More articles in Mathematics from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager.

 
Handle: RePEc:gam:jmathe:v:12:y:2024:i:22:p:3576-:d:1521978