Data-driven differentiation analysis of urban high-tech industries: Research on bibliometrics and large language models

Hua Song, Jun Zeng, Yang Zheng, Han Huang and Hongyu Wang

PLOS ONE, 2026, vol. 21, issue 5, 1-38

Abstract: This study examines inter-city heterogeneity in China’s high-tech industries from a regional innovation systems (RIS) perspective, with a particular focus on how variations in knowledge production, technological application, and actor configurations are associated with divergent urban innovation trajectories. We compile more than 39,000 publications from the Web of Science (WOS) and nearly 10,000 patent records from the national patent database for the period 2016–2025, covering four representative cities—Wuhan, Chengdu, Hangzhou, and Tianjin—and four technological domains: artificial intelligence (AI), fiber-optic communication (FOC), intelligent connected vehicles (ICV), and storage chips (SC). The study develops an integrated analytical framework combining bibliometric analysis, co-word network modeling, collaboration network mapping, and large language model (LLM)–assisted semantic interpretation. LLMs are employed primarily in keyword cleaning, terminology standardization, and topic identification, improving the consistency and interpretability of textual metadata. Visualizations generated using VOSviewer highlight pronounced inter-city differences in technological portfolios, research priorities, and collaboration structures. The results suggest distinct urban innovation configurations across the four cities. Wuhan exhibits strong positioning in FOC and SC, reflecting a combined industry–academy orientation. Hangzhou shows high frontier intensity in AI and ICV, consistent with an industry-led and digitally driven innovation profile. Chengdu demonstrates substantial academic output but comparatively weaker evidence of technological translation, while Tianjin, despite a smaller overall scale, displays notable specialization in applied domains such as brain–computer interfaces and smart port technologies. 
Rather than replacing quantitative analysis, LLM-assisted interpretation supports the identification and contextualization of these patterns by enhancing semantic coherence and reducing noise in large-scale textual data. Overall, the proposed framework provides a reproducible and scalable approach for examining regional technological differentiation and is applicable to comparative studies of urban innovation systems across different regions and industrial contexts.

Date: 2026

Downloads: (external link)
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0348590 (text/html)
https://journals.plos.org/plosone/article/file?id= ... 48590&type=printable (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:plo:pone00:0348590

DOI: 10.1371/journal.pone.0348590

More articles in PLOS ONE from Public Library of Science

 
Page updated 2026-05-17
Handle: RePEc:plo:pone00:0348590