Public Perception of Urban Recreational Spaces Based on Large Vision–Language Models: A Case Study of Beijing’s Third Ring Area

Yan Wang, Xin Hou, Xuan Wang and Wei Fan
Additional contact information
Yan Wang: School of Architecture, Tianjin University, Tianjin 300072, China
Xin Hou: School of Architecture, Tianjin University, Tianjin 300072, China
Xuan Wang: School of Architecture, Tianjin University, Tianjin 300072, China
Wei Fan: School of Architecture, Tianjin University, Tianjin 300072, China

Land, 2025, vol. 14, issue 11, 1-36

Abstract: Urban recreational spaces (URSs) are pivotal for enhancing resident well-being, making the accurate assessment of public perceptions crucial for quality optimization. Compared to traditional surveys, social media data provide a scalable means for multi-dimensional perception assessment. However, existing studies predominantly rely on single-modal data, which limits the comprehensive capturing of complex perceptions and lacks interpretability. To address these gaps, this study employs cutting-edge large vision–language models (LVLMs) and develops an interpretable model, Qwen2.5-VL-7B-SFT, through supervised fine-tuning on a manually annotated dataset. The model integrates visual-linguistic features to assess four perceptual dimensions of URSs: esthetics, attractiveness, cultural significance, and restorativeness. Crucially, we generate textual evidence for our judgments by identifying the key spatial elements and emotional characteristics associated with specific perceptions. By integrating multi-source built environment data with Optuna-optimized machine learning and SHAP analysis, we further decipher the nonlinear relationships between built environment variables and perceptual outcomes. The results are as follows: (1) Interpretable LVLMs are highly effective for urban spatial perception research. (2) URSs within Beijing’s Third Ring Road fall into four typologies (historical heritage, commercial entertainment, ecological-natural, and cultural spaces), with significant correlations observed between physical elements and emotional responses. (3) Historical heritage accessibility and POI density are identified as key predictors of public perception. Positive perception significantly improves when a block’s POI functional density exceeds 4000 units/km² or when its 500 m radius encompasses more than four historical heritage sites. Our methodology enables precise quantification of multidimensional URS perceptions, links built environment elements to perceptual mechanisms, and provides actionable insights for urban planning.

Keywords: large vision–language models; urban recreational spaces; urban spatial perception; social media data; multi-modal data
JEL-codes: Q15 Q2 Q24 Q28 Q5 R14 R52
Date: 2025

Downloads: (external link)
https://www.mdpi.com/2073-445X/14/11/2155/pdf (application/pdf)
https://www.mdpi.com/2073-445X/14/11/2155/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jlands:v:14:y:2025:i:11:p:2155-:d:1782235

Land is currently edited by Ms. Carol Ma

More articles in Land from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager.

 
Page updated 2025-10-30
Handle: RePEc:gam:jlands:v:14:y:2025:i:11:p:2155-:d:1782235