Improved Machine Learning-Based Predictive Models for Breast Cancer Diagnosis
Abdur Rasool,
Chayut Bunterngchit,
Luo Tiejian,
Md. Ruhul Islam,
Qiang Qu and
Qingshan Jiang
Additional contact information
Abdur Rasool: University of Chinese Academy of Sciences, Beijing 101408, China
Chayut Bunterngchit: University of Chinese Academy of Sciences, Beijing 101408, China
Luo Tiejian: University of Chinese Academy of Sciences, Beijing 101408, China
Md. Ruhul Islam: Department of Electrical Engineering and Computer Science, University of Stavanger, 4044 Stavanger, Norway
Qiang Qu: Shenzhen Key Lab for High Performance Data Mining, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
Qingshan Jiang: Shenzhen Key Lab for High Performance Data Mining, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
IJERPH, 2022, vol. 19, issue 6, 1-19
Abstract:
Breast cancer death rates are higher than those of any other cancer in American women. Machine learning-based predictive models promise earlier detection in breast cancer diagnosis. However, evaluating which models diagnose cancer efficiently remains challenging. In this work, we propose data exploratory techniques (DET) and develop four predictive models to improve breast cancer diagnostic accuracy. Before model building, a four-layered DET comprising feature distribution, correlation, elimination, and hyperparameter optimization was examined in depth to identify robust features for classification into malignant and benign classes. The proposed techniques and classifiers were implemented on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset and the Breast Cancer Coimbra Dataset (BCCD). Standard performance metrics, including confusion matrices and K-fold cross-validation, were applied to assess each classifier's efficiency and training time. The models' diagnostic capability improved with our DET: on the WDBC dataset, polynomial SVM reached 99.3% accuracy, LR 98.06%, KNN 97.35%, and EC 97.61%. We also compared our main results with previous studies in terms of accuracy. The implementation procedure and findings can guide physicians in adopting an effective model for the practical understanding and prognosis of breast cancer tumors.
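The evaluation step named in the abstract (confusion matrices and K-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of label 1 for malignant, and the contiguous (unshuffled) fold layout are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels.

    Assumption: label 1 means malignant (positive), label 0 means benign.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of correct predictions; 0.0 when there are no samples."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds.

    Returns a list of (train_indices, test_indices) pairs. Each sample
    appears in exactly one test fold. Shuffling and stratification are
    omitted here for brevity.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        # The first `extra` folds get one additional sample.
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds
```

In use, a classifier would be fitted on each `train` index list and scored on the matching `test` list, with the per-fold accuracies averaged to give the cross-validated estimate.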
Keywords: machine learning models; data exploratory techniques; breast cancer diagnosis; tumors classification
JEL-codes: I I1 I3 Q Q5
Date: 2022
Downloads:
https://www.mdpi.com/1660-4601/19/6/3211/pdf (application/pdf)
https://www.mdpi.com/1660-4601/19/6/3211/ (text/html)
Persistent link: https://EconPapers.repec.org/RePEc:gam:jijerp:v:19:y:2022:i:6:p:3211-:d:767137
IJERPH is currently edited by Ms. Jenna Liu
Bibliographic data for series maintained by MDPI Indexing Manager.