Adapting Hidden Naive Bayes for Text Classification

Gan, Shengfeng; Shao, Shiqi; Chen, Long; Yu, Liangjun; Jiang, Liangxiao

Additional contact information
Shengfeng Gan: College of Computer, Hubei University of Education, Wuhan 430205, China
Shiqi Shao: School of Computer Science, China University of Geosciences, Wuhan 430074, China
Long Chen: School of Computer Science, China University of Geosciences, Wuhan 430074, China
Liangjun Yu: College of Computer, Hubei University of Education, Wuhan 430205, China
Liangxiao Jiang: School of Computer Science, China University of Geosciences, Wuhan 430074, China

Mathematics, 2021, vol. 9, issue 19, 1-14

Abstract: Due to its simplicity, efficiency, and effectiveness, multinomial naive Bayes (MNB) has been widely used for text classification. As in naive Bayes (NB), its assumption that features are conditionally independent is often violated, which reduces its classification performance. Of the numerous approaches to relaxing this assumption, structure extension has attracted less attention from researchers. To the best of our knowledge, only structure-extended MNB (SEMNB) has been proposed so far. SEMNB averages all weighted super-parent one-dependence multinomial estimators; therefore, it is an ensemble learning model. In this paper, we propose a single model called hidden MNB (HMNB) by adapting the well-known hidden NB (HNB). HMNB creates a hidden parent for each feature, which synthesizes all the other qualified features’ influences. To learn HMNB, we propose a simple but effective learning algorithm that avoids a computationally expensive structure-learning process. The same idea can also be used to improve complement NB (CNB) and the one-versus-all-but-one model (OVA); the resulting models are denoted HCNB and HOVA, respectively. Extensive experiments on eleven benchmark text classification datasets validate the effectiveness of HMNB, HCNB, and HOVA.
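
Illustrative note (not part of the paper): the sketch below shows only the standard MNB baseline that the abstract starts from, with Laplace smoothing, to make the conditional-independence assumption concrete. Each word contributes its own log-probability given the class, independently of every other word. It is not the authors' HMNB learning algorithm, which this record does not describe in detail. The function names `train_mnb` and `predict_mnb`, the smoothing parameter `alpha`, and the toy documents are assumptions made up for this example.

```python
import math
from collections import Counter, defaultdict


def train_mnb(docs, labels, alpha=1.0):
    """Fit a plain multinomial naive Bayes model.

    docs   -- list of token lists (hypothetical toy input format)
    labels -- list of class labels, one per document
    alpha  -- Laplace smoothing constant (assumed default of 1.0)
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    docs_per_class = Counter(labels)
    word_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)

    n_docs = len(docs)
    log_prior = {c: math.log(k / n_docs) for c, k in docs_per_class.items()}

    log_cond = {}
    for c in docs_per_class:
        # Smoothed denominator: total word occurrences in class c plus alpha per vocab word.
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_cond[c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return log_prior, log_cond


def predict_mnb(model, tokens):
    """Return the class with the highest log-posterior score.

    Words are scored independently given the class, which is exactly the
    conditional-independence assumption. Out-of-vocabulary tokens are skipped.
    """
    log_prior, log_cond = model
    scores = {}
    for c in log_prior:
        scores[c] = log_prior[c] + sum(
            log_cond[c][t] for t in tokens if t in log_cond[c]
        )
    return max(scores, key=scores.get)


# Toy usage with invented data.
toy_docs = [
    ["cheap", "pills", "cheap"],
    ["meeting", "agenda"],
    ["cheap", "offer"],
    ["agenda", "notes", "meeting"],
]
toy_labels = ["spam", "ham", "spam", "ham"]
toy_model = train_mnb(toy_docs, toy_labels)
```

Structure-extension methods such as the one summarized above relax the per-word independence in this scoring step by letting each word's probability also depend on other words.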

Keywords: text classification; multinomial naive Bayes; hidden multinomial naive Bayes; attribute conditional independence assumption; structure extension
JEL-codes: C
Date: 2021

Downloads: (external link)
https://www.mdpi.com/2227-7390/9/19/2378/pdf (application/pdf)
https://www.mdpi.com/2227-7390/9/19/2378/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:9:y:2021:i:19:p:2378-:d:642716

Mathematics is currently edited by Ms. Emma He