Extracting toponyms from OpenStreetMap and other gazetteers: comparing representational accuracy in multilingual contexts
Francesco-Alessio Ursini and Giuseppe Samo
Additional contact information
Francesco-Alessio Ursini: Central China Normal University
Giuseppe Samo: Beijing Language and Culture University
Palgrave Communications, 2025, vol. 12, issue 1, 1-16
Abstract:
This paper investigates OpenStreetMap (OSM) as a research tool by analysing the advantages and drawbacks that the platform offers to linguistics and to GIS disciplines. To this end, it analyses how the platform represents places as geographical units and toponyms (i.e. place names) as linguistic units referring to places. The paper presents two previous studies that featured a novel procedure for toponym extraction and its application to OpenStreetMap toponym data. These studies focused on distinct scales and densities of geographical distribution in multilingual contexts: city level (Macao) and mixed regional and national level (Italy). They also compared these data with data originating from an authoritative geographic source (e.g. Italian street directories). The present paper extends the analysis and results of these studies by showing that a single extraction algorithm can obtain all the relevant toponyms from overpass-turbo, a platform that includes OpenStreetMap’s textual information, and from other gazetteers. For each level of analysis, the paper shows that toponyms come in different combinations of multilingual formats: Chinese and Portuguese for Macao; Italian, local dialects (e.g. Genoese), and minority languages (e.g. German) for Italy. From these data, the paper offers an analysis of the language-specific features, methodological challenges, and informational accuracy of each database. The paper proposes that OpenStreetMap may be as reliable as authoritative sources, provided that OpenStreetMap-based data are confirmed through cross-source comparison during data analysis. It concludes by discussing the current role of OpenStreetMap as an information database in toponym extraction, its use in linguistics and GIS disciplines, and how these uses can offer theoretical insights informing research in these disciplines.
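The extraction and cross-source comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure. It assumes a response in the Overpass API JSON layout (an "elements" list whose items carry a "tags" dictionary) and the OSM tagging convention of "name" plus "name:<language code>" keys. The function names, the sample elements, and the reference list are all invented for illustration.

```python
from typing import Dict, Iterable, List, Set


def extract_toponyms(overpass_json: dict, langs: Iterable[str]) -> List[Dict[str, str]]:
    """Collect the default and language-specific names of each element.

    Hypothetical helper: keeps only the "name" and "name:<lang>" tags.
    """
    keys = ["name"] + [f"name:{lang}" for lang in langs]
    records = []
    for element in overpass_json.get("elements", []):
        tags = element.get("tags", {})
        names = {key: tags[key] for key in keys if key in tags}
        if names:  # skip elements that carry no name at all
            records.append(names)
    return records


def compare_with_reference(records: List[Dict[str, str]],
                           reference: Set[str],
                           key: str = "name") -> Dict[str, Set[str]]:
    """Split names into confirmed, OSM-only, and reference-only sets.

    Matching is a simple case-insensitive exact match (an assumption);
    real data would likely need more careful normalisation.
    """
    extracted = {r[key].strip().casefold() for r in records if key in r}
    ref = {name.strip().casefold() for name in reference}
    return {
        "confirmed": extracted & ref,
        "osm_only": extracted - ref,
        "reference_only": ref - extracted,
    }


# Invented sample in the Overpass JSON layout, not real OSM data.
sample = {"elements": [
    {"type": "way", "id": 1,
     "tags": {"highway": "residential", "name": "Via Roma", "name:de": "Romstraße"}},
    {"type": "way", "id": 2,
     "tags": {"highway": "residential", "name": "Via Garibaldi"}},
    {"type": "node", "id": 3, "tags": {"amenity": "bench"}},
]}

records = extract_toponyms(sample, ["it", "de"])
report = compare_with_reference(records, {"Via Roma", "Via Dante"})
```

Here `report` separates the names that the invented reference list confirms from those found in only one source, which is the kind of check the abstract recommends before relying on OpenStreetMap-based data.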
Date: 2025
Publisher page: http://link.springer.com/10.1057/s41599-025-05025-1 (abstract, text/html)
Access to full text is restricted to subscribers.
Persistent link: https://EconPapers.repec.org/RePEc:pal:palcom:v:12:y:2025:i:1:d:10.1057_s41599-025-05025-1
Ordering information: This journal article can be ordered from
https://www.nature.com/palcomms/about
DOI: 10.1057/s41599-025-05025-1
Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.