Machine learning-driven risk stratification for distant metastasis in gastric cancer: A comparative study of clinical features and composite indices integrated models

Shaoxue Yang and Han Lei

PLOS ONE, 2025, vol. 20, issue 10, 1-18

Abstract: Objective: Distant metastasis (DM) of gastric cancer (GC) is a significant health challenge because of its high mortality rate, which calls for better early detection and management strategies. This study aimed to develop an interpretable machine learning (ML) model for preoperative prediction of DM in GC.
Methods: We retrospectively analyzed 1,009 GC patients, of whom 769 from Zhejiang Cancer Hospital formed the development cohort and 240 from Zhejiang Provincial Hospital of Chinese Medicine formed the external test cohort. Nine clinical features and four composite indices derived from ten laboratory indicators were selected as candidate features. The dataset was balanced using the borderline Synthetic Minority Over-sampling Technique (SMOTE) and the Edited Nearest Neighbors (ENN) under-sampling method. Univariate and multivariate analyses were used to identify key metastasis-related features. Based on the identified features, we developed predictive models using five ML algorithms, with performance evaluated via receiver operating characteristic (ROC) curves, recall, and precision-recall (PR) curves. Finally, Shapley additive explanations (SHAP) analysis was applied to rank feature importance and explain the final model.
Results: Univariate and multivariate analyses identified five metastasis-related features: cT stage, cN stage, differentiation grade, PLR and TMI. Logistic regression emerged as the optimal predictive model, with the highest area under the ROC curve (AUC) among the five models at 0.942 (95% CI: 0.922–0.962), a recall of 0.895 (95% CI: 0.843–0.947), and an area under the precision-recall curve (AUPRC) of 0.889 (95% CI: 0.867–0.911). In the internal and external test cohorts, the AUC values were 0.935 (95% CI: 0.897–0.972) and 0.879 (95% CI: 0.833–0.926), respectively. SHAP analysis identified the features that contributed most to the model's predictions.
Conclusion: This ML model integrates clinical features and composite indices to predict GC metastasis risk, supported by an online tool to guide preoperative decision-making.
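
The abstract reports AUC and recall with 95% confidence intervals. As a rough illustration of what those quantities measure, the following Python sketch computes a rank-based AUC, recall at a fixed threshold, and a percentile bootstrap interval. It is not the authors' code: the toy labels and scores, the 0.5 threshold, and the choice of a bootstrap for the interval are all assumptions made for illustration, since the abstract does not state how the intervals were obtained.

```python
import random
from typing import Callable, Sequence, Tuple


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def recall(y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5) -> float:
    """Share of true positives among all actual positives at the given threshold."""
    tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s < threshold)
    if tp + fn == 0:
        raise ValueError("need at least one positive sample")
    return tp / (tp + fn)


def bootstrap_ci(
    y_true: Sequence[int],
    y_score: Sequence[float],
    metric: Callable[[Sequence[int], Sequence[float]], float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval (an assumed method, not taken from the paper)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        if len(set(yt)) < 2:  # skip resamples that lost one of the classes
            continue
        stats.append(metric(yt, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[round((alpha / 2) * (n_boot - 1))]
    upper = stats[round((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper


# Hypothetical toy data: 1 = distant metastasis, 0 = none; scores are model probabilities.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7]

auc = roc_auc(y_true, y_score)        # 14 of 16 pairs ranked correctly: 0.875
rec = recall(y_true, y_score)         # 3 of 4 positives score >= 0.5: 0.75
auc_lo, auc_hi = bootstrap_ci(y_true, y_score, roc_auc)
print(f"AUC={auc:.3f} (95% CI: {auc_lo:.3f}-{auc_hi:.3f}), recall={rec:.3f}")
```

With only eight samples the bootstrap interval is very wide. On cohorts of the size described in the abstract, the same procedure would give much tighter bounds.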

Date: 2025

Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0335258 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 35258&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0335258

DOI: 10.1371/journal.pone.0335258

More articles in PLOS ONE from Public Library of Science
 
Page updated 2025-11-16
Handle: RePEc:plo:pone00:0335258