Criticism and robustification of latent Gaussian models

Latent Gaussian models (LGMs) are widely used but struggle with datasets that contain non-Gaussian features, such as sudden jumps or spikes. This dissertation aims to give researchers tools to check the adequacy of a fitted LGM (criticism) and, when that check fails, efficient and user-friendly implementations of latent non-Gaussian models that lead to more robust inferences (robustification).

Overview

Abstract

Latent Gaussian models (LGMs) are perhaps the most commonly used class of statistical models, with broad applications in fields including biostatistics, econometrics, and spatial modeling. LGMs assume that a set of unobserved, or latent, variables follows a Gaussian distribution; these latent variables are commonly used to model spatial and temporal dependence in the data. The availability of computational tools, such as R-INLA, that permit fast and accurate estimation of LGMs has made their use widespread. Nevertheless, it is easy to find datasets that contain inherently non-Gaussian features, such as sudden jumps or spikes, that adversely affect the inferences and predictions made from an LGM. These datasets require more general latent non-Gaussian models (LnGMs), which handle such features automatically by assuming more flexible and robust non-Gaussian distributions on the latent variables. However, fast implementations and easy-to-use software are lacking, which has prevented LnGMs from being widely adopted.
 
This dissertation aims to tackle these challenges and provide ready-to-use implementations for the R-INLA package. We view scientific learning as an iterative process involving model criticism followed by model improvement and robustification. Thus, the first step is to provide a framework that allows researchers to criticize and check the adequacy of an LGM without fitting the more expensive LnGM. We employ concepts from Bayesian sensitivity analysis to check how strongly the latent Gaussian assumption influences the statistical conclusions, and Bayesian predictive checking to check whether the fitted LGM can predict important features in the data. In many applications, this procedure will suffice to justify using an LGM. For cases where this check fails, we provide fast and scalable implementations of LnGMs based on variational Bayes and Laplace approximations. The approximation leads to an LGM that downweights extreme events in the latent variables, reducing their impact and leading to more robust inferences. Both steps, LGM criticism and LGM robustification, can be executed in R-INLA with only a few additional lines of code. This results in a robust workflow that applied researchers can readily use.

Brief Biography

Rafael obtained a B.Sc. degree in Engineering Physics in 2017 from the University of Lisbon (Instituto Superior Tecnico). He then enrolled in an M.Sc. program in Mathematics and Applications at the same university, focusing on Statistics and Data Science. He received two diplomas of academic merit from the University of Lisbon, one during his B.Sc. in Engineering Physics and another during his M.Sc. in Mathematics and Applications. In 2018 he obtained a research grant from the Portuguese Foundation for Science and Technology for research in extreme value theory and spatial statistics. In 2013 he earned an asteroid discovery award, confirmed by the Minor Planet Center at Harvard University, for the asteroid designated 2013 EZ7.

Presenters