Strategies to access web-enabled urban spatial data for socioeconomic research using R functions

Andrés Vallone, Coro Chasco Yrigoyen and Beatriz Sánchez
Additional contact information
Andrés Vallone: Universidad Católica del Norte
Beatriz Sánchez: Catholic University of Ávila

Journal of Geographical Systems, 2020, vol. 22, issue 2, No 3, 217-239

Abstract: Since the introduction of the World Wide Web in the 1990s, the information available for research purposes has increased exponentially, leading to a significant proliferation of research based on web-enabled data. The use of internet-enabled databases, obtained from either primary online surveys or secondary official and non-official registers, is now common. However, the availability of information varies by data category and country; in particular, collecting microdata at a low geographical level for urban analysis can be a challenge. The most common difficulties when working with secondary web-enabled data fall into two categories: accessibility and availability problems. Accessibility problems arise when the way data are published on the servers blocks or delays the download process, turning it into a tedious, repetitive task that can introduce errors when building large databases. Availability problems usually arise when official agencies restrict access to the information for statistical confidentiality reasons. To overcome some of these problems, this paper presents different strategies based on URL parsing, PDF text extraction, and web scraping. A set of functions, available under a GPL-2 license, was built into an R package specifically to extract and organize databases at the municipality level (NUTS 5) in Spain for population, unemployment, vehicle fleet, and firm characteristics.
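
As a rough illustration of the URL parsing strategy named in the abstract, the sketch below builds a batch of download URLs by varying query parameters, so that a repetitive manual download becomes a loop. It is not the paper's R package and does not reproduce any of its functions: it uses Python's standard library, and the endpoint, the parameter names (`province`, `year`), and the helper names are all hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

# Hypothetical endpoint standing in for an agency's download server.
BASE_URL = "https://example.org/export/table.csv"


def build_url(base_url, **params):
    """Return base_url with its query string replaced by the given parameters."""
    parts = urlsplit(base_url)
    # Sort the parameters so the generated URLs are deterministic.
    query = urlencode(sorted(params.items()))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


def batch_urls(base_url, provinces, years):
    """Generate one URL per (province, year) pair, provinces in the outer loop."""
    return [
        build_url(base_url, province=p, year=y)
        for p in provinces
        for y in years
    ]


def read_params(url):
    """Parse a URL back into a flat dict of its query parameters."""
    return {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}


# Illustrative inputs only: two province codes and two years.
urls = batch_urls(BASE_URL, ["05", "28"], [2018, 2019])
for url in urls:
    print(url, read_params(url))
```

Each generated URL could then be passed to a downloader. Parsing the parameters back out of the URL makes it easy to label every downloaded file with the province and year it belongs to.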

Keywords: Web scraping; URL parsing; Spatial microdata; Spain
JEL-codes: C81 C88 R58
Date: 2020

Downloads: (external link)
http://link.springer.com/10.1007/s10109-019-00309-y Abstract (text/html)
Access to full text is restricted to subscribers.

Persistent link: https://EconPapers.repec.org/RePEc:kap:jgeosy:v:22:y:2020:i:2:d:10.1007_s10109-019-00309-y

Ordering information: This journal article can be ordered from
http://www.springer. ... ce/journal/10109/PS2

DOI: 10.1007/s10109-019-00309-y

Journal of Geographical Systems is currently edited by Manfred M. Fischer and Antonio Páez

Bibliographic data for series maintained by Sonal Shukla and Springer Nature Abstracting and Indexing.

 
Handle: RePEc:kap:jgeosy:v:22:y:2020:i:2:d:10.1007_s10109-019-00309-y