
An Integrated Algorithm with Feature Selection, Data Augmentation, and XGBoost for Ovarian Cancer

Jingxun Cai, Zne-Jung Lee, Zhihxian Lin, Chih-Hung Hsu and Yun Lin
Additional contact information
Jingxun Cai: Graduate School of New Generation Electronic Information Engineer, School of Advanced Manufacturing, Fuzhou University, Quanzhou 362200, China
Zne-Jung Lee: Department of Electronic and Information Engineering, School of Advanced Manufacturing, Fuzhou University, Quanzhou 362200, China
Zhihxian Lin: Department of Electronic and Information Engineering, School of Advanced Manufacturing, Fuzhou University, Quanzhou 362200, China
Chih-Hung Hsu: Institute of Logistics Engineering and Management, College of Transportation, Fujian University of Technology, Fuzhou 350118, China
Yun Lin: School of Intelligent Construction, Fuzhou University of International Studies and Trade, Fuzhou 350200, China

Mathematics, 2024, vol. 12, issue 24, 1-18

Abstract: Ovarian cancer is one of the most aggressive gynecological cancers because of its high invasiveness and chemoresistance. It not only has a high incidence rate but also tops the list of mortality rates. Its subtle early symptoms make diagnosis difficult and significantly delay treatment. Once ovarian cancer reaches an advanced stage, treatment becomes substantially more complex and difficult, which lowers patient survival rates. It is therefore crucial for both medical professionals and patients to remain highly vigilant about the early signs of ovarian cancer so that intervention is timely. In recent years, ovarian cancer prediction research has advanced, making it possible to analyze the likelihood and type of cancer from patients' genetic data. With the rapid development of machine learning, numerous efficient classification models have emerged, and these offer significant potential for developing diagnostic prediction methods for ovarian cancer. However, traditional approaches often struggle to achieve satisfactory classification accuracy on high-dimensional genetic datasets with small sample sizes. This research proposes a prediction model that uses genomic data to improve the early diagnosis rate of ovarian cancer. It combines feature selection, data augmentation through adversarial conditional generative adversarial networks (AC-GAN), and an extreme gradient boosting (XGBoost) classifier. First, feature selection simplifies the original genetic dataset by removing irrelevant variables and noise, which improves the model's predictive accuracy. After this dimensionality reduction, AC-GAN enriches the data by producing more realistic genetic samples, which improves the model's generalization capacity. Finally, the XGBoost classifier is applied to the augmented data to predict ovarian cancer efficiently.
The results indicate that the proposed diagnostic method has a significant advantage in the predictive diagnosis of ovarian cancer, with an accuracy of 99.01% that surpasses the technologies currently in use. Additionally, the algorithm identifies twelve genes highly relevant to ovarian cancer, providing valuable insights for physicians during diagnosis.
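
The three stages named in the abstract (feature selection, augmentation, boosted classification) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the data are synthetic, the feature score is a simple standardized gap between class means (the abstract does not name the selection method), Gaussian jitter stands in for AC-GAN, and gradient-boosted decision stumps stand in for XGBoost. All function names are hypothetical.

```python
import math
import random
import statistics


def make_toy_data(n_samples=40, n_features=200, n_informative=5, seed=0):
    """Hypothetical stand-in for a gene-expression matrix: few samples, many features."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in range(n_informative):  # only the first columns carry class signal
            row[j] += 3.0 * label
        X.append(row)
        y.append(label)
    return X, y


def select_features(X, y, k):
    """Keep the k columns with the largest standardized gap between class means."""
    scored = []
    for j in range(len(X[0])):
        pos = [row[j] for row, t in zip(X, y) if t == 1]
        neg = [row[j] for row, t in zip(X, y) if t == 0]
        spread = statistics.pstdev(pos + neg) or 1.0
        scored.append((abs(statistics.mean(pos) - statistics.mean(neg)) / spread, j))
    return sorted(j for _, j in sorted(scored, reverse=True)[:k])


def augment(X, y, copies, noise=0.1, seed=1):
    """Gaussian jitter per sample; a crude placeholder for AC-GAN, not a GAN."""
    rng = random.Random(seed)
    X_aug, y_aug = list(X), list(y)
    for row, label in zip(X, y):
        for _ in range(copies):
            X_aug.append([v + rng.gauss(0.0, noise) for v in row])
            y_aug.append(label)
    return X_aug, y_aug


def fit_stump(X, residuals):
    """Least-squares regression stump: one feature, one threshold, two leaf means."""
    n, total, best = len(X), sum(residuals), None
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for pos in range(1, n):
            left_sum += residuals[order[pos - 1]]
            lo, hi = X[order[pos - 1]][j], X[order[pos]][j]
            if lo == hi:  # cannot split between identical values
                continue
            right_sum = total - left_sum
            gain = left_sum ** 2 / pos + right_sum ** 2 / (n - pos)
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / pos, right_sum / (n - pos))
    return best[1:]


def fit_boosted_stumps(X, y, rounds=20, lr=0.3):
    """Plain gradient boosting on logistic loss (a simplified stand-in for XGBoost)."""
    scores, stumps = [0.0] * len(X), []
    for _ in range(rounds):
        # Negative gradient of the logistic loss: label minus predicted probability.
        residuals = [t - 1.0 / (1.0 + math.exp(-s)) for t, s in zip(y, scores)]
        j, thr, left, right = fit_stump(X, residuals)
        stumps.append((j, thr, lr * left, lr * right))
        scores = [s + (lr * left if row[j] <= thr else lr * right)
                  for row, s in zip(X, scores)]
    return stumps


def predict(stumps, row):
    """Sum the leaf values of all stumps and threshold the score at zero."""
    score = sum(left if row[j] <= thr else right for j, thr, left, right in stumps)
    return 1 if score > 0 else 0


def run_pipeline(k=5, copies=3):
    X, y = make_toy_data()
    X_train, y_train, X_test, y_test = X[:30], y[:30], X[30:], y[30:]
    keep = select_features(X_train, y_train, k)  # scored on training rows only
    X_train_k = [[row[j] for j in keep] for row in X_train]
    X_test_k = [[row[j] for j in keep] for row in X_test]
    X_aug, y_aug = augment(X_train_k, y_train, copies)  # test rows are never augmented
    stumps = fit_boosted_stumps(X_aug, y_aug)
    preds = [predict(stumps, row) for row in X_test_k]
    accuracy = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
    return {"selected": keep, "n_train_augmented": len(X_aug),
            "predictions": preds, "accuracy": accuracy}


if __name__ == "__main__":
    print(run_pipeline())
```

In this sketch the held-out rows are set aside before features are scored or samples are synthesized, so the reported accuracy is measured on data that neither stage has seen. That ordering is a design choice of the sketch; the abstract does not describe the paper's evaluation protocol.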

Keywords: integrated algorithm; ovarian cancer; feature selection; data augmentation; extreme gradient boosting; generative adversarial networks
JEL-codes: C
Date: 2024

Downloads: (external link)
https://www.mdpi.com/2227-7390/12/24/4041/pdf (application/pdf)
https://www.mdpi.com/2227-7390/12/24/4041/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jmathe:v:12:y:2024:i:24:p:4041-:d:1550869

Mathematics is currently edited by Ms. Emma He

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-03-19
Handle: RePEc:gam:jmathe:v:12:y:2024:i:24:p:4041-:d:1550869