Prof. Narayanaswamy Balakrishnan, Department of Mathematics and Statistics, McMaster University
Sunday, November 12, 2023, 15:30 - 16:30
B1-L4-R4102
Abstract
In this talk, I will describe the family of mean-mixtures of m
Monday, November 06, 2023, 09:00 - 17:00
KAUST
The workshop will feature the latest research on statistical methods and modeling to address real-world challenges in health, environment, and sustainability.
Thursday, November 02, 2023, 12:00 - 13:00
B9-L2-R2325
Abstract
Free boundary problems arise naturally in a range of mathemati
Erick Chacon Montalvan, Postdoctoral fellow, Statistics Geohealth Group, KAUST
Thursday, October 19, 2023, 12:00 - 13:00
B9-L2-R2325
Abstract
Spatial data analysis commonly needs to deal with spatial data
Thursday, October 12, 2023, 12:00 - 13:00
B9-L2-R2325
In this talk, we propose and validate a space multiscale model for the description of particle diffusion in the presence of trapping boundaries. We start from a drift-diffusion equation in which the drift term, modeled by the Lennard–Jones potential, describes the effect of bubble traps.
Thursday, October 05, 2023, 12:00 - 13:00
B9-L2-R2325
Abstract
The goal of the least squares method is to find the best linea
Thursday, September 28, 2023, 12:00 - 13:00
Building 9, Level 2, Room 2325
We study theoretical problems of fault diagnosis in circuits and switching networks, which are among the most fundamental models for computing Boolean functions.
Postdoctoral Research Fellow, Biostatistics
Thursday, September 21, 2023, 12:00 - 13:00
Building 9, Level 2, Room 2325
Cross-validation is an algorithmic technique extensively used for estimating the prediction error, tuning the regularization parameter, and choosing between competing predictive rules.
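As a rough illustration of the first use mentioned above (estimating prediction error), here is a minimal K-fold sketch in plain Python. The simple line-fit model and the helper names `k_fold_indices`, `fit_line`, and `cv_error` are illustrative assumptions, not taken from the talk.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def cv_error(xs, ys, k=5, seed=0):
    """K-fold estimate of the mean squared prediction error of the line fit."""
    folds = k_fold_indices(len(xs), k, seed)
    squared_error = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        # Fit on the training folds only, then predict the held-out fold.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        squared_error += sum((ys[i] - (a + b * xs[i])) ** 2 for i in held_out)
    return squared_error / len(xs)
```

The same loop can be reused for tuning a regularization parameter or comparing predictive rules by evaluating `cv_error` once per candidate and keeping the smallest value.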
Postdoctoral Fellow, Statistics
Thursday, September 14, 2023, 12:00 - 13:00
Building 9, Level 2, Room 2325
Goodness-of-fit tests determine how well a set of observed data fits a particular probability distribution. They can also show if some categorical variable follows a hypothesized family of distributions.
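One classical instance of such a test for categorical data is Pearson's chi-square statistic. The sketch below is a generic textbook version, not necessarily the method discussed in the talk; the function name `pearson_chi_square` and the die-roll counts are made up for illustration.

```python
def pearson_chi_square(observed, probs):
    """Pearson statistic sum((O - E)^2 / E), with expected count E = n * p."""
    if len(observed) != len(probs):
        raise ValueError("observed and probs must have the same length")
    n = sum(observed)
    statistic = 0.0
    for count, p in zip(observed, probs):
        expected = n * p
        statistic += (count - expected) ** 2 / expected
    return statistic


# Hypothetical data: 60 rolls of a die, tested against a fair-die hypothesis.
rolls = [8, 12, 9, 11, 10, 10]
stat = pearson_chi_square(rolls, [1 / 6] * 6)
```

With six categories and no estimated parameters, the statistic is compared with a chi-square distribution on 5 degrees of freedom, whose 95% quantile is approximately 11.07.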
PhD Student, Statistics
Monday, September 11, 2023, 16:00 - 17:00
Building 3, Level 5, Room 5220; https://kaust.zoom.us/j/97801785665
The statistical modeling of spatial and extreme events provides a framework for the development of techniques and models to describe natural phenomena in a variety of environmental, geoscience, and climate science applications. In a changing climate, various natural hazards, such as wildfires, are believed to have evolved in frequency, size, and spatial extent, although regional responses may vary.
Thursday, September 07, 2023, 12:00 - 13:00
Building 9, Level 2, Room 2325
I will review some works on the high-friction limit (or small mass approximation) from Euler flows to advection-diffusion systems that are gradient flows, and related asymptotic problems in fluid mechanics. The formulation exploits the variational structure of compressible Euler flows and is connected to the interpretation of nonlinear Fokker-Planck systems as gradient flows in Wasserstein distance.
Postdoctoral Fellows, Statistics
Thursday, August 31, 2023, 12:00 - 13:00
Building 9, Level 2, Room 2325
Estimating first-order intensity functions is crucial in the analysis of point patterns on linear networks, but selecting suitable bandwidths for non-parametric methods remains challenging. We propose an adaptive intensity estimator based on the heat kernel that adjusts bandwidths according to the data points, a novel approach in this context.
PhD Student, Statistics
Sunday, June 04, 2023, 15:00 - 16:00
Building 4, Level 5, Room 5220; https://kaust.zoom.us/j/99802128930
The Integrated Nested Laplace Approximations (INLA) method has become a commonly used tool for researchers and practitioners to perform approximate Bayesian inference in various fields of application. It has become essential to incorporate more complex models and expand the method’s capabilities with more features. In this dissertation, we contribute to the INLA method in different aspects.
Prof. Stefano Castruccio, Associate professor, University of Notre Dame, USA
Sunday, June 04, 2023, 10:00 - 11:00
Building 1, Level 4, Room 4102
It is widely acknowledged that the relentless surge of Volume, Velocity and Variety of data, as well as the simultaneous increase of computational resources, has stimulated the development of data-driven methods with unprecedented flexibility and predictive power. However, not every environmental study entails a large data set: many applications, ranging from astronomy to paleo-climatology, have a high associated sampling cost and are instead constrained by physics-informed partial differential equations. Throughout the past few years, a new and powerful paradigm has emerged in the machine learning literature, merging data-driven and physics-informed problems, hence providing a unified framework for a whole spectrum of problems ranging from data-rich/context-poor to data-poor/context-rich. In this talk, I will present this new framework and discuss some of the most recent efforts to reformulate it as a stochastic model-based approach, thereby allowing calibrated uncertainty quantification.
PhD Student, Statistics
Tuesday, May 30, 2023, 15:30 - 17:30
Building 1, Room 4102; https://kaust.zoom.us/my/zhedong
The commonly used leave-one-out and K-fold cross-validation methods are not suitable for structured models with multiple prediction tasks. To overcome this limitation, we introduce leave-group-out cross-validation, which allows groups to adapt to different tasks. We propose an automatic group construction method and provide an efficient approximation for latent Gaussian models. Moreover, this method is conveniently implemented in the R-INLA software.
PhD Student, Statistics
Sunday, May 28, 2023, 15:00 - 16:00
Building 1, Level 4, Room 4102; https://kaust.zoom.us/j/7276313489
Latent Gaussian models (LGM) are widely used but struggle with certain datasets that contain non-Gaussian features, such as sudden jumps or spikes. This dissertation aims to provide tools for researchers to check the adequacy of the fitted LGM (criticism) and, if the check fails, to offer efficient and user-friendly implementations of latent non-Gaussian models, which lead to more robust inferences (robustification).
Peter Rousseeuw, Professor Emeritus, Statistics and Data Science, KU Leuven, Belgium
Tuesday, May 09, 2023, 15:00 - 16:00
Building 9, Level 2, Room 2325
Classification is a major tool of statistics and machine learning. Several classifiers have interesting visualizations of their inner workings. Here we pursue a different goal, which is to visualize the cases being classified, either in training data or in test data. An important aspect is whether a case has been classified to its given class (label) or whether the classifier wants to assign it to a different class. This is reflected in the probability of the alternative class (PAC). A high PAC indicates label bias, i.e., the possibility that the case was mislabeled. The PAC is used to construct a silhouette plot which is similar in spirit to the silhouette plot for cluster analysis. The average silhouette width can be used to compare different classifications of the same dataset. We will also draw quasi residual plots of the PAC versus a data feature, which may lead to more insight into the data. One of these data features is how far each case lies from its given class, yielding so-called class maps. The proposed displays are constructed for discriminant analysis, k-nearest neighbors, support vector machines, CART, random forests, and neural networks. The graphical displays are illustrated and interpreted on data sets containing images, mixed features, and texts.
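The abstract does not spell out how the PAC or the silhouette width is computed. The sketch below assumes the definitions commonly given in the related published work: the PAC of a case compares the posterior probability of its given class with that of the best competing class, and the silhouette width is one minus twice the PAC. The function names are hypothetical.

```python
def pac(posteriors, given):
    """Probability of the alternative class for a single case.

    posteriors: class probabilities produced by the classifier for this case.
    given: index of the class the case is labeled with.
    Assumed definition: p_alt / (p_given + p_alt), where p_alt is the largest
    posterior among all classes other than the given one.
    """
    p_given = posteriors[given]
    p_alt = max(p for c, p in enumerate(posteriors) if c != given)
    return p_alt / (p_given + p_alt)


def silhouette_width(posteriors, given):
    """Assumed definition s = 1 - 2 * PAC, so s lies between -1 and 1."""
    return 1.0 - 2.0 * pac(posteriors, given)
```

Under these assumptions a case that sits comfortably in its given class has a PAC near 0 and a silhouette width near 1, while a possibly mislabeled case has a PAC above 0.5 and a negative width.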
Peter Rousseeuw, Professor Emeritus, Statistics and Data Science, KU Leuven, Belgium
Tuesday, May 09, 2023, 12:00 - 13:00
Building 9, Level 2, Room 2325
A multivariate dataset consists of n cases in d dimensions, and is often stored in an n by d data matrix. It is well-known that real data may contain outliers. Depending on the situation, outliers may be (a) undesirable errors which can adversely affect the data analysis, or (b) valuable nuggets of unexpected information. In statistics and data analysis the word outlier usually refers to a row of the data matrix, and the methods to detect such outliers only work when at least half the rows are clean. But often many rows have a few contaminated cell values, which may not be visible by looking at each variable (column) separately.
We describe a method to detect deviating data cells in a multivariate sample which takes the correlations between the variables into account. It has no restriction on the number of clean rows, and can deal with high dimensions. Other advantages are that it provides predicted values of the outlying cells, while imputing missing values at the same time.
We illustrate the method on several real data sets, where it uncovers more structure than found by purely columnwise methods or purely rowwise methods. The proposed method can help to diagnose why a certain row is outlying, e.g. in process control. It also serves as an initial step for estimating multivariate location and scatter matrices, and for cellwise robust principal component analysis.
Postdoctoral Fellow, Statistics
Monday, May 01, 2023, 12:00 - 13:00
Building 9, Level 3, Room 3128
The Mardia measures of multivariate skewness and kurtosis summarize the respective characteristics of a multivariate distribution with two numbers. However, these measures do not reflect the sub-dimensional features of the distribution. Consequently, testing procedures based on these measures may fail to detect skewness or kurtosis present in a sub-dimension of the multivariate distribution. We introduce sub-dimensional Mardia measures of multivariate skewness and kurtosis, and investigate the information they convey about all sub-dimensional distributions of some symmetric and skewed families of multivariate distributions.
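For reference, the classical (full-dimensional) Mardia sample measures that the talk builds on can be written as follows. The notation (n observations x_i in d dimensions, sample mean \bar{x}, covariance S with divisor n) is an assumption chosen here; the sub-dimensional versions introduced in the talk are not reproduced.

```latex
% Sample covariance with divisor n (assumed convention)
S = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^{\top}

% Mardia's multivariate skewness
b_{1,d} = \frac{1}{n^{2}} \sum_{i=1}^{n} \sum_{j=1}^{n}
          \left[ (x_i - \bar{x})^{\top} S^{-1} (x_j - \bar{x}) \right]^{3}

% Mardia's multivariate kurtosis
b_{2,d} = \frac{1}{n} \sum_{i=1}^{n}
          \left[ (x_i - \bar{x})^{\top} S^{-1} (x_i - \bar{x}) \right]^{2}
```

For a d-dimensional normal distribution the population skewness is 0 and the population kurtosis is d(d + 2), which is why each measure reduces the whole distribution to a single number.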
PhD Student, Statistics
Tuesday, April 04, 2023, 16:00 - 19:00
Building 4, Level 5, Room 5220; https://kaust.zoom.us/j/95859608188
This Ph.D. research focuses on proposing new statistical methods for two types of time series data: integer-valued data and multivariate nonstationary extreme data. For the former, the researcher proposes a novel approach to building an integer-valued autoregressive (INAR) model that offers the flexibility to specify both marginal and innovation distributions, leading to several new INAR processes. For the latter, the researcher proposes new extreme value theory methods for analyzing multivariate nonstationary extreme data, specifically EEG recordings from patients with epilepsy. Two extreme-value methods, Conex-Connect and Club Exco, are proposed to study alterations in the brain network during extreme events such as epileptic seizures.
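To make the INAR idea concrete, here is a minimal simulation of the classical INAR(1) process with binomial thinning and Poisson innovations. This is the standard textbook model, swapped in for illustration; it is not the new construction proposed in the thesis, and the function names and parameter values are assumptions.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) variate by Knuth's multiplication method (small lam)."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def thin(rng, x, alpha):
    """Binomial thinning: each of the x units survives with probability alpha."""
    return sum(1 for _ in range(x) if rng.random() < alpha)


def simulate_inar1(n, alpha, lam, seed=0):
    """Simulate X_t = alpha o X_{t-1} + e_t with e_t ~ Poisson(lam)."""
    rng = random.Random(seed)
    # Start from the stationary marginal, which is Poisson(lam / (1 - alpha)).
    x = poisson(rng, lam / (1 - alpha))
    path = []
    for _ in range(n):
        x = thin(rng, x, alpha) + poisson(rng, lam)
        path.append(x)
    return path
```

Every value in the simulated path is a non-negative integer, and the long-run mean is lam / (1 - alpha); changing the innovation sampler is the simplest way to experiment with other innovation distributions.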
PhD Student, Statistics
Tuesday, March 28, 2023, 16:00 - 19:00
Building 4, Level 5, Room 5220; https://kaust.zoom.us/j/97763748127
Risk assessment for natural hazards and financial extreme events requires the statistical analysis of extreme events, often beyond observed levels. The characterization and extrapolation of the probability of rare events rely on assumptions about the extremal dependence type and about the specific structure of statistical models. In this thesis, we develop models with flexible tail dependence structures, in order to provide a reliable estimation of tail characteristics and risk measures. Our novel methodologies are illustrated by a range of applications to financial, climatic, and health data.
Prof. Victor DeOliveira, Professor in Department of Management Science and Statistics in the Carlos Alvarez College of Business
Wednesday, March 15, 2023, 15:00 - 16:00
Building 1, Level 4, Room 4102
The Matérn family of covariance functions is currently the most commonly used for the analysis of geostatistical data due to its ability to describe different smoothness behaviors. Yet, in many applications the smoothness parameter is set at an arbitrary value.
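For readers unfamiliar with the family, one common parametrization of the isotropic covariance is shown below. Several parametrizations are in use, so the symbols here (variance \sigma^2, range \rho, smoothness \nu) are an assumption and may differ from those used in the talk.

```latex
C(h) = \sigma^{2} \, \frac{2^{1-\nu}}{\Gamma(\nu)}
       \left( \frac{\sqrt{2\nu}\, h}{\rho} \right)^{\nu}
       K_{\nu}\!\left( \frac{\sqrt{2\nu}\, h}{\rho} \right),
\qquad h \ge 0,
```

Here K_\nu is the modified Bessel function of the second kind. In this parametrization \nu = 1/2 gives the exponential covariance \sigma^2 \exp(-h/\rho), and \nu \to \infty gives the Gaussian covariance \sigma^2 \exp(-h^2 / (2\rho^2)), which illustrates how strongly the choice of \nu controls smoothness.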
Monday, March 13, 2023, 12:00 - 13:00
Building 9, Level 3, Room 3128
We study theoretical problems of fault diagnosis in circuits and switching networks, which are among the most fundamental models for computing Boolean functions. We investigate two main cases: when the scheme (circuit or switching network) has the same mode of operation for both calculation and diagnostics, and when the scheme has two modes of operation: normal for calculation and special for diagnostics.
Maurizio Filippone, Associate Professor, EURECOM, France
Monday, March 06, 2023, 12:00 - 13:00
Building 9, Level 3, Room 3128
The impressive success of Deep Learning (DL) in predictive performance tasks has fueled the hope that it can help address societal challenges by supporting sound decision making. However, many open questions remain about its suitability to live up to this promise. In this talk, I will discuss some of the current limitations of DL, which directly affect its wide adoption. I will focus in particular on the poor ability of DL models to quantify uncertainty in predictions, and I will present Bayesian DL as an attractive approach combining the flexibility of DL with probabilistic reasoning. I will then discuss the challenges associated with carrying out inference and specifying sensible priors for DL models. After presenting a few of my contributions to address these problems, I will conclude by presenting some interesting emerging research trends and open problems which define my current research agenda.
Prof. Ioannis Papastathopoulos, Lecturer in Statistics, University of Edinburgh
Tuesday, February 28, 2023, 09:00 - 16:00
Building 1, Level 4, Room 4102
Refined characterizations of the probabilistic behavior of a stationary time series by focusing on renormalized Markov processes that are conditioned to attain an extreme event, subject to the level of the extremity tending to the upper endpoint of the marginal distribution