Improving the Accuracy of Misclassified Breast Cancer Data using Machine Learning

Rong-Ho Lin, Benjamin Kofi Kujabi, Chun-Ling Chuang, Yueh-Chung Chen and Chang-Ming Chen
Additional contact information
Rong-Ho Lin: National Taipei University of Technology, Department of Industrial Engineering and Management, 1, Sec. 3, Zhongxiao E. Rd., Taipei 10608, Taiwan, ROC
Benjamin Kofi Kujabi: National Taipei University of Technology, Department of Industrial Engineering and Management, 1, Sec. 3, Zhongxiao E. Rd., Taipei 10608, Taiwan, ROC
Chun-Ling Chuang: Kainan University, Department of Information Management
Yueh-Chung Chen: Division of Cardiology, Department of Internal Medicine, Taipei City Hospital, Renai Branch, Taipei
Chang-Ming Chen: Radiation Oncology Department, Tri-Service General Hospital

Eximia Journal, 2022, vol. 4, issue 1, 19-32

Abstract: Background: Breast cancer is the most common cancer among women. Many studies have made significant progress in classifying breast cancer tumors, with much emphasis on finding the best algorithm and the highest classification accuracy but limited interest in correcting misclassified data (Type 1 and Type 2 errors). Objective: This research proposes a novel hybrid system that integrates WEKA (Waikato Environment for Knowledge Analysis) with case-based reasoning (CBR), using the myCBR plugin with Protégé, to classify breast cancer tumors and correct misclassified data (Type 1 and Type 2 errors). Methods: The Wisconsin breast cancer dataset, retrieved from the Wisconsin university repository, was used in this research. The dataset contains 699 instances, 2 classes (malignant and benign), and 9 integer-valued attributes. The J48, IBK, LibSVM, JRip, and Multi-Layer Perceptron (MLP) classifiers were applied to classify the tumors. Next, the myCBR plugin with Protégé was used as an advanced modeling technique to correct the misclassified data and improve classification accuracy. Results: Model performance was evaluated on sensitivity, specificity, precision, and accuracy. The IBK classifier produced the most misclassified data, and the integrated system improved its classification accuracy from 95.61% to 98.53%. Conclusion: The findings demonstrate that integrating WEKA with the myCBR plugin with Protégé had unprecedented results with misclassified data, thus providing accurate diagnostic procedures for distinguishing between benign and malignant tumors.
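
The four evaluation measures named in the abstract, and the effect of correcting Type 1 and Type 2 errors on them, can be illustrated with a minimal Python sketch. It is not the authors' WEKA/myCBR pipeline. It assumes that malignant is the positive class, and the `ConfusionCounts` class, the `correct_errors` helper, and all counts are hypothetical values chosen for illustration, not figures from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix cells, assuming malignant is the positive class."""
    tp: int  # malignant predicted malignant
    fp: int  # benign predicted malignant (Type 1 error)
    tn: int  # benign predicted benign
    fn: int  # malignant predicted benign (Type 2 error)

    def sensitivity(self) -> float:
        # Share of malignant cases that were detected.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of benign cases that were correctly cleared.
        return self.tn / (self.tn + self.fp)

    def precision(self) -> float:
        # Share of malignant predictions that were truly malignant.
        return self.tp / (self.tp + self.fp)

    def accuracy(self) -> float:
        # Share of all cases that were classified correctly.
        return (self.tp + self.tn) / (self.tp + self.fp + self.tn + self.fn)


def correct_errors(c: ConfusionCounts, fixed_fp: int, fixed_fn: int) -> ConfusionCounts:
    """Hypothetical helper: move corrected cases out of the error cells.

    A corrected Type 1 error becomes a true negative; a corrected
    Type 2 error becomes a true positive.
    """
    if not (0 <= fixed_fp <= c.fp and 0 <= fixed_fn <= c.fn):
        raise ValueError("cannot correct more errors than exist")
    return ConfusionCounts(
        tp=c.tp + fixed_fn,
        fp=c.fp - fixed_fp,
        tn=c.tn + fixed_fp,
        fn=c.fn - fixed_fn,
    )


# Illustrative counts only (200 cases); these are NOT the paper's results.
before = ConfusionCounts(tp=90, fp=6, tn=100, fn=4)
after = correct_errors(before, fixed_fp=4, fixed_fn=2)

print(f"accuracy before correction: {before.accuracy():.4f}")
print(f"accuracy after correction:  {after.accuracy():.4f}")
```

Keeping the four cells in one immutable record makes it explicit that a correction step only moves cases between cells and never changes the total number of cases.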

Keywords: Misclassified data; Classifiers; WEKA; myCBR; protege
Date: 2022

Downloads: (external link)
https://eximiajournal.com/index.php/eximia/article/view/100/53 (application/pdf)
https://eximiajournal.com/index.php/eximia/article/view/100 (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:tec:eximia:v:4:y:2022:i:1:p:19-32

Eximia Journal is currently edited by Tanase Tasente

More articles in Eximia Journal from Plus Communication Consulting SRL
Bibliographic data for series maintained by Tanase Tasente.

 
Page updated 2025-03-20
Handle: RePEc:tec:eximia:v:4:y:2022:i:1:p:19-32