Combined analysis of spatially misaligned data using Gaussian fields and the stochastic partial differential equation approach
Overview
Abstract
Spatially misaligned data are becoming increasingly common due to advances in both data collection and management in a wide range of scientific disciplines including the epidemiological, ecological and environmental fields. Here, we present a Bayesian geostatistical model for fusion of data obtained at point and areal resolutions. The model assumes that underlying all observations there is a spatially continuous variable that can be modeled using a Gaussian random field process. The model is fitted using the integrated nested Laplace approximation (INLA) and the stochastic partial differential equation (SPDE) approaches. In the SPDE approach, a continuously indexed Gaussian random field is represented as a discretely indexed Gaussian Markov random field (GMRF) by means of a finite basis function representation defined on a triangulation of the region of study. To allow the combination of point and areal data, we propose a new projection matrix for mapping the GMRF from the observation locations to the triangulation nodes, which takes into account the types of data to be combined. The performance of the model is examined via simulation when it is fitted to (i) point, (ii) areal, and (iii) point and areal data combined to predict several simulated surfaces that can appear in real settings. The model is also applied to predict the concentration of fine particulate matter (PM2.5) in Los Angeles and Ventura counties, USA, during 2011. The results show that combining point and areal data provides better predictions than applying the method to just one type of data, and this holds for both simulated and real data. We conclude that the approach presented may be a helpful advance in the area of spatial statistics by providing a useful tool that is applicable in a wide range of situations where information at different spatial resolutions needs to be combined.
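To make the idea of a projection matrix that handles both data types more concrete, the following is a minimal sketch in a one-dimensional analogue, where piecewise-linear "hat" functions on a grid of nodes stand in for the basis functions on a triangulation. It is an illustration under stated assumptions, not the construction used in the talk: the function names (`point_row`, `areal_row`, `projection_matrix`), the midpoint-rule averaging for areal rows, and the example field are all hypothetical.

```python
import numpy as np

def point_row(x, nodes):
    """Hat-function (piecewise-linear) weights of location x on sorted 1D nodes."""
    row = np.zeros(len(nodes))
    # Index j of the mesh interval [nodes[j], nodes[j + 1]] containing x.
    j = np.clip(np.searchsorted(nodes, x, side="right") - 1, 0, len(nodes) - 2)
    t = (x - nodes[j]) / (nodes[j + 1] - nodes[j])
    row[j], row[j + 1] = 1.0 - t, t
    return row

def areal_row(lo, hi, nodes, n_quad=200):
    """Assumed areal rule: average the point weights over the interval [lo, hi]."""
    # Midpoint rule with n_quad equally sized sub-intervals.
    edges = np.linspace(lo, hi, n_quad + 1)
    mids = 0.5 * (edges[:-1] + edges[1:])
    return np.mean([point_row(x, nodes) for x in mids], axis=0)

def projection_matrix(points, areas, nodes):
    """Stack point rows and areal rows into one matrix (observations x nodes)."""
    rows = [point_row(x, nodes) for x in points]
    rows += [areal_row(lo, hi, nodes) for lo, hi in areas]
    return np.vstack(rows)

nodes = np.linspace(0.0, 1.0, 6)      # mesh nodes 0, 0.2, ..., 1
points = [0.1, 0.5]                   # point observation locations
areas = [(0.0, 0.4), (0.4, 1.0)]      # areal observations as intervals

A = projection_matrix(points, areas, nodes)
w = 2.0 * nodes + 1.0                 # hypothetical field values at the nodes
print(A @ w)                          # point values, then interval averages
```

Each row of `A` sums to one, so every observation, whether point or areal, is expressed as a weighted average of the node values. For the linear example field, the product `A @ w` gives approximately 1.2 and 2.0 for the two points and 1.4 and 2.4 for the two interval averages.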
Brief Biography
Paula Moraga (https://www.paulamoraga.com/) is an Assistant Professor of Statistics at King Abdullah University of Science and Technology (KAUST) and the Principal Investigator of the GeoHealth research group. Paula's research focuses on the development of innovative statistical methods and computational tools for geospatial data analysis and health surveillance, and her work has directly informed strategic policy for reducing disease burden in several countries. She has developed modeling architectures to understand the geographic and temporal patterns of diseases such as malaria in Africa and cancer in Australia and to identify targets for intervention, and has worked on the development of a number of R packages for Bayesian risk modeling, detection of disease clusters, and risk assessment of travel-related spread of disease. Paula has published extensively in leading journals and is the author of the book 'Geospatial Health Data: Modeling and Visualization with R-INLA and Shiny' (2019, Chapman & Hall/CRC). Paula received her Ph.D. degree in Mathematics from the University of Valencia, and her Master's degree in Biostatistics from Harvard University.