On Cross Validation, Log Gaussian Cox Process and Variational Bayes
This dissertation advances Bayesian computation by developing a cross validation approach to assess log Gaussian Cox processes (LGCPs) defined on Euclidean, manifold, or network domains, and by improving skewed posterior approximations for latent Gaussian models (LGMs).
Overview
The log Gaussian Cox process (LGCP) is arguably the most widely used model for spatial point pattern (SPP) analysis. Although well-established methods and software for fitting these models are available, little progress has been made in evaluating them in a Bayesian context. In this dissertation, we propose a cross validation approach to assess LGCPs defined on Euclidean, manifold, or network domains. The first challenge is conceptual: in SPP analysis, the datum-to-leave-out needs to be redefined; we treat it as a region-to-leave-out and use it with the logarithmic scoring rule for joint predictions. The second, practical challenge is computational efficiency, mainly avoiding the need for multiple model refits.
By viewing cross validation as a data removal problem, we propose a unified perspective on information deletion, leading to a novel fundamental result that characterizes Bayes' theorem as an optimal information deletion rule. This generalization, à la Zellner, provides a systematic approach to Bayesian unlearning and a framework for obtaining accurate approximations of the joint leave-data-out predictive distribution within the class of latent Gaussian models (LGMs), to which LGCPs belong.
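To make the "no refit" idea concrete, here is a minimal sketch on a toy model. It is not the dissertation's method. It assumes a homogeneous Poisson process with a single intensity lambda and a Gamma(a, b) prior, observed as counts in disjoint regions. Because counts in disjoint regions are conditionally independent given lambda, the standard identity p(y_A | y_-A) = 1 / E[1 / p(y_A | lambda)] holds, with the expectation taken under the full-data posterior. The sketch compares the leave-region-out log score from an explicit refit with the same score recovered from the full posterior alone. All function names, the prior, and the data are hypothetical.

```python
import math


def log_poisson_pmf(n, mean):
    """Log of the Poisson pmf at count n for a given mean."""
    return n * math.log(mean) - mean - math.lgamma(n + 1)


def log_gamma_pdf(x, shape, rate):
    """Log of the Gamma(shape, rate) density at x > 0."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1) * math.log(x) - rate * x)


def log_score_refit(counts, areas, k, a, b):
    """Leave-region-k-out log score obtained by refitting without region k.

    The refitted posterior is Gamma(shape, rate), so the predictive for the
    held-out count is negative binomial and available in closed form.
    """
    shape = a + sum(counts) - counts[k]
    rate = b + sum(areas) - areas[k]
    n, area = counts[k], areas[k]
    return (math.lgamma(shape + n) - math.lgamma(shape) - math.lgamma(n + 1)
            + shape * math.log(rate / (rate + area))
            + n * math.log(area / (rate + area)))


def log_score_no_refit(counts, areas, k, a, b, upper=60.0, steps=200_000):
    """Same log score computed from the full-data posterior only.

    Uses 1 / p(y_k | y_-k) = E[1 / p(y_k | lambda)] under the full posterior,
    evaluated here with a plain midpoint rule on (0, upper).
    """
    shape = a + sum(counts)   # full-data posterior shape
    rate = b + sum(areas)     # full-data posterior rate
    n, area = counts[k], areas[k]
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        total += math.exp(log_gamma_pdf(lam, shape, rate)
                          - log_poisson_pmf(n, lam * area))
    return -math.log(total * h)


if __name__ == "__main__":
    counts = [3, 5, 2, 7]          # hypothetical point counts per region
    areas = [1.0, 1.5, 0.5, 2.0]   # hypothetical region areas
    for k in range(len(counts)):
        print(k,
              log_score_refit(counts, areas, k, 2.0, 1.0),
              log_score_no_refit(counts, areas, k, 2.0, 1.0))
```

In this conjugate toy case both routes agree up to quadrature error. For LGCPs the posterior is not available in closed form, which is why the accuracy of the posterior approximation matters for this kind of computation.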
A further implication of this new perspective is a clear duality between learning and unlearning, underscoring the importance of accurate posterior distributions. Consequently, this dissertation also contributes to Bayesian computation by proposing improved skewed posterior approximations for LGMs, obtained by efficiently combining variational Bayes and orthogonal polynomials.
Together, the results from this dissertation, on some fundamental properties of Bayesianism and computation, offer a coherent methodology for scalable inference and model assessment in latent Gaussian models, opening the door to more complex model exploration and robust data analysis using the R-INLA, inlabru, rSPDE, and MetricGraph packages.
Presenters
Brief Biography
Hans obtained a B.S. in Applied Mathematics from the Federal University of Rio Grande and holds an M.S. in Statistics from the Federal University of São Carlos and the University of São Paulo, in Brazil. He also worked as a data scientist for a global credit bureau.