
Urban Subway Station Site Selection Prediction Based on Clustered Demand and Interpretable Machine Learning Models

Yun Liu, Xin Yao, Hang Lv, Dingjie Zhou, Zhiqiang Xie, Xiaoqing Zhao, Quan Zhu and Cong Chai
Additional contact information
Yun Liu: College of Earth Sciences, Yunnan University, Kunming 650500, China
Xin Yao: College of Earth Sciences, Yunnan University, Kunming 650500, China
Hang Lv: College of Earth Sciences, Yunnan University, Kunming 650500, China
Dingjie Zhou: Yunnan Provincial Institute of Surveying and Mapping, Kunming 650500, China
Zhiqiang Xie: College of Earth Sciences, Yunnan University, Kunming 650500, China
Xiaoqing Zhao: College of Earth Sciences, Yunnan University, Kunming 650500, China
Quan Zhu: Kunming Urban Transport Institute, Kunming 650500, China
Cong Chai: Kunming Urban Transport Institute, Kunming 650500, China

Land, 2025, vol. 14, issue 8, 1-29

Abstract: With accelerating urbanization, the development of rail transit systems, particularly subways, has become a key strategy for alleviating urban traffic congestion. However, existing studies on subway station site selection often lack a spatially continuous evaluation of site suitability across the entire study area. This can lead to a disconnect between planning and actual demand, resulting in issues such as “overbuilt infrastructure” or the “island effect.” To address this issue, this study takes Kunming City, China, as the study area. It clusters existing subway stations by passenger flow using the K-means++ algorithm, integrates multi-source spatial data, applies a random forest algorithm for optimal positive sample selection and driving factor identification, and then uses a LightGBM-SHAP explainable machine learning framework to develop a mathematical predictive model for station location. The main findings are as follows: (1) Using the random forest model, 20 key drivers influencing site selection were identified. SHAP analysis revealed that the top five contributing factors were connectivity, nighttime lighting, road network density, transportation service, and residence density. Transportation-related factors accounted for three of these five and emerged as the primary determinants of subway station site selection. (2) The site selection prediction model performed strongly, achieving an R² of 0.95 on the test set and an average R² of 0.79 under spatial 5-fold cross-validation, indicating high model reliability. The spatial distribution of predicted suitability showed that the core urban area within the Second Ring Road had the highest suitability, with suitability gradually declining toward the periphery. High-suitability areas in suburban regions outside the Third Ring Road were primarily aligned along existing subway lines. (3) The cumulative predicted probability within a 300 m buffer zone around each station was positively correlated with passenger flow levels. Overlaying the predicted results on current station locations revealed strong spatial consistency, indicating that the model outputs closely align with the actual spatial layout and passenger usage intensity of existing stations. These findings provide decision-making support for optimizing subway station layouts and planning future transportation infrastructure, and they offer both theoretical and practical value for data-driven site selection.
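The buffer check in finding (3) can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the grid cells, station coordinates, and passenger flows are hypothetical toy values, coordinates are assumed to be in a projected system measured in metres, and the helper names `buffer_probability_sum` and `pearson` are introduced here for the example.

```python
import math


def buffer_probability_sum(station, cells, radius_m=300.0):
    """Sum the predicted suitability of grid cells whose centres lie
    within radius_m of the station (straight-line distance)."""
    sx, sy = station
    return sum(p for (x, y, p) in cells
               if math.hypot(x - sx, y - sy) <= radius_m)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


# Hypothetical toy data: (x, y, predicted probability) per grid cell.
cells = [
    (0, 0, 0.9), (100, 0, 0.8), (0, 200, 0.7),  # around station A
    (1000, 0, 0.4), (1100, 100, 0.3),           # around station B
    (3000, 0, 0.1),                             # around station C
]
stations = {"A": (0, 0), "B": (1000, 0), "C": (3000, 0)}
flows = {"A": 30000, "B": 12000, "C": 2000}  # hypothetical daily riders

# Cumulative predicted probability inside each station's 300 m buffer.
sums = {name: buffer_probability_sum(xy, cells)
        for name, xy in stations.items()}

# Correlation between buffer sums and passenger flow across stations.
names = list(stations)
r = pearson([sums[n] for n in names], [flows[n] for n in names])
print(sums, round(r, 3))
```

In a real workflow the cells would come from the model's suitability raster and the buffers from a GIS library; the plain distance test here stands in for that step.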

Keywords: subway station site selection; spatial suitability prediction; machine learning; SHAP interpretation; GIS
JEL-codes: Q15 Q2 Q24 Q28 Q5 R14 R52
Date: 2025

Downloads: (external link)
https://www.mdpi.com/2073-445X/14/8/1612/pdf (application/pdf)
https://www.mdpi.com/2073-445X/14/8/1612/ (text/html)


Persistent link: https://EconPapers.repec.org/RePEc:gam:jlands:v:14:y:2025:i:8:p:1612-:d:1720508

Land is currently edited by Ms. Carol Ma

Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-08-09
Handle: RePEc:gam:jlands:v:14:y:2025:i:8:p:1612-:d:1720508