Abstract
In recent decades, statisticians have increasingly turned their attention to data that exhibit non-Gaussian behaviors such as asymmetry and various heavy-tail properties. As a result, the assumptions of symmetry and fixed tail weight in Gaussian models have become restrictive and may fail to capture the intrinsic properties of the data. To address the limitations of Gaussian models, a variety of skewed models have been proposed, and their popularity has grown rapidly. These skewed models introduce parameters that govern skewness and tail weight. Among the various proposals in the literature, unified skewed distributions, such as the Unified Skew-Normal (SUN), have received considerable attention. Nonetheless, the SUN, in its current form, suffers from non-identifiability and is therefore inappropriate for applications. Hence, in this work, we pinpoint the causes of its non-identifiability and propose remedies, such as sub-models, that address this issue. We further study the Unified Skew-t (SUT) distribution as an extension of the SUN that introduces one more parameter to control the tail weight. For both distributions, we study their linear transformations, marginal and conditional distributions, Mardia's measures, and canonical forms, among other properties. We also examine the structures of the skewness matrix and latent correlation matrix of the SUN distribution and construct the Unified Skew-Normal distribution with independent sets of latent variables (SUNSET), which possesses many desirable properties, such as computational efficiency at the inference stage. Finally, we apply the SUN to various settings, such as random fields, by constructing unified skewed processes.
To establish a solid foundation, we conduct a comprehensive analysis and assessment of the efficiency of the most commonly used parameterizations of the Matérn covariance (frequently used to model spatial dependence structure), with specific recommendations for their use under different circumstances. In addition, because spatial and spatio-temporal data are large in dimension, leading to intractable likelihood functions or parameter spaces, there has been rising interest in applying neural networks to the modeling and inference of such data. We propose combining traditional spatial models, such as the stochastic partial differential equation (SPDE) approach, which uses the Matérn covariance function, with sparse and random recurrent neural networks (RNNs) to model data with complex dynamics. Moreover, we introduce a new and more general non-Gaussian process, called the generalized SUN (GSUN) spatial process, and develop its inference mechanism (Neural Bayes Estimator) using deep graph attention networks (GATs) and encoder transformers. We show that the GSUN process differs from existing spatial processes and Tukey g-and-h processes.
Brief Biography
Kesen Wang is a Ph.D. student in the Spatio-Temporal Statistics and Data Science group at King Abdullah University of Science and Technology (KAUST). He received his M.S. degree in Statistics from KAUST, under the supervision of Prof. Marc Genton, and his B.S. degree in Mathematics (Statistics Track) from the University of Maryland, College Park, USA, in 2019. His research interests include multivariate Gaussian and non-Gaussian spatio-temporal statistics and deep learning for spatial and spatio-temporal datasets.