Deeper Spatial Statistical Insights into Small Geographic Area Data Uncertainty
Daniel A. Griffith,
Yongwan Chun and
Monghyeon Lee
Additional contact information
Daniel A. Griffith: School of Economic, Political and Policy Sciences, The University of Texas at Dallas, 800 West Campbell Road, Richardson, TX 75080, USA
Yongwan Chun: School of Economic, Political and Policy Sciences, The University of Texas at Dallas, 800 West Campbell Road, Richardson, TX 75080, USA
Monghyeon Lee: Memory Business Division, Samsung Electronics Co. Ltd., 1, Samsungjeonja-ro, Hwaseong-si, Gyeonggi-do 18448, Korea
IJERPH, 2020, vol. 18, issue 1, 1-16
Abstract:
Small areas refer to small geographic areas, a more literal meaning of the phrase, as well as small domains (e.g., small sub-populations), a more figurative meaning of the phrase. With post-stratification, even with big data, either case can encounter the problem of small local sample sizes, which tend to inflate local uncertainty and undermine otherwise sound statistical analyses. This condition is the opposite of that afflicting statistical significance in the context of big data. These two definitions can also occur jointly, such as during the standardization of data: small geographic units may contain small populations, which in turn have small counts in various age cohorts. Accordingly, big spatial data can become not-so-big spatial data after post-stratification by geography and, for example, by age cohorts. This situation can be ameliorated to some degree by the large volume of and high velocity of big spatial data. However, the variety of any big spatial data may well exacerbate this situation, compromising veracity in terms of bias, noise, and abnormalities in these data. The purpose of this paper is to establish deeper insights into big spatial data with regard to their uncertainty through one of the hallmarks of georeferenced data, namely spatial autocorrelation, coupled with small geographic areas. Impacts of interest concern the nature, degree, and mixture of spatial autocorrelation. The cancer data employed (from Florida for 2001–2010) represent a data category that is beginning to enter the realm of big spatial data; its volume, velocity, and variety are increasing through the widespread use of digital medical records.
Keywords: big data; big spatial data; cancer; small area; small geographic area (search for similar items in EconPapers)
JEL-codes: I I1 I3 Q Q5 (search for similar items in EconPapers)
Date: 2020
References: View references in EconPapers View complete reference list from CitEc
Citations: View citations in EconPapers (1)
Downloads: (external link)
https://www.mdpi.com/1660-4601/18/1/231/pdf (application/pdf)
https://www.mdpi.com/1660-4601/18/1/231/ (text/html)
Related works:
This item may be available elsewhere in EconPapers: Search for items with the same title.
Export reference: BibTeX
RIS (EndNote, ProCite, RefMan)
HTML/Text
Persistent link: https://EconPapers.repec.org/RePEc:gam:jijerp:v:18:y:2020:i:1:p:231-:d:470716
Access Statistics for this article
IJERPH is currently edited by Ms. Jenna Liu
More articles in IJERPH from MDPI
Bibliographic data for series maintained by MDPI Indexing Manager ().