Machine learning and rule-based embedding techniques for classifying text documents
Asmaa M. Aubaid, Alok Mishra and Atul Mishra
Additional contact information
Asmaa M. Aubaid: Ministry of Higher Education and Scientific Research/Science and Technology
Alok Mishra: Norwegian University of Science and Technology
Atul Mishra: BML Munjal University
International Journal of System Assurance Engineering and Management, 2024, vol. 15, issue 12, No 13, 5637-5652
Abstract:
The rapid expansion of electronic document archives and the proliferation of online information have made categorizing text documents increasingly difficult. Classification aids information retrieval within a conceptual framework. This study addresses the challenge of efficiently categorizing text documents across this vast electronic document landscape. It compares a novel document categorization method, W2vRule, with traditional machine learning models. Emphasizing the importance of tuning hyperparameters for optimal performance, the research recommends W2vRule, a word-to-vector rule-based framework, for improved association-based text classification. The study uses the Reuters Newswire dataset. The findings show that W2vRule and machine learning models can effectively distinguish important categories, and that rule-based approaches outperform Naive Bayes, BayesNet, Decision Tables, and other classifiers on the reported performance metrics.
Keywords: Bayesian; BayesNet; Lazy; IBK; Naïve Bayes; IBL; Reuters; Text analytics
Date: 2024
Downloads: http://link.springer.com/10.1007/s13198-024-02555-w (abstract, text/html)
Access to the full text of the articles in this series is restricted.
Persistent link: https://EconPapers.repec.org/RePEc:spr:ijsaem:v:15:y:2024:i:12:d:10.1007_s13198-024-02555-w
Ordering information: This journal article can be ordered from
http://www.springer.com/engineering/journal/13198
DOI: 10.1007/s13198-024-02555-w
International Journal of System Assurance Engineering and Management is currently edited by P.K. Kapur, A.K. Verma and U. Kumar
More articles in International Journal of System Assurance Engineering and Management from Springer, The Society for Reliability, Engineering Quality and Operations Management (SREQOM), India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.