Discriminant Analysis and Support Vector Regression in High Dimensions: Sharp Performance Analysis and Optimal Designs

Location
KAUST

Abstract

Machine learning is emerging as a powerful tool for data science and is being applied across almost all disciplines. In many applications, the number of features is comparable to, or even larger than, the number of samples, and both grow large. This setting is usually called the high-dimensional regime, and it raises new challenges and questions for the application of machine learning. In this work, we conduct a high-dimensional performance analysis of some popular classification and regression techniques.

In the first part, discriminant analysis classifiers are considered. A major challenge in the practical use of these classifiers is that they depend on the inverses of covariance matrices that must be estimated from training data. Several estimators of the inverse covariance matrices can be used, the most common being those based on regularization. The main advantage of such estimators is their resilience to sampling noise, which makes them suitable for high-dimensional settings. In this thesis, we propose new estimators that are shown to yield better performance. The main principle of our approach is to design an optimized inverse covariance matrix estimator under the assumption that the covariance matrix is a low-rank perturbation of a scaled identity matrix. We show that the proposed classifiers are not only easier to implement but, as evidenced by numerical simulations, also outperform the classical regularization-based discriminant analysis classifiers.

In the second part, we carry out a high-dimensional statistical analysis of linear support vector regression. Under plausible assumptions on the statistical distribution of the data, we characterize the feasibility condition for hard support vector regression in the high-dimensional regime and, when it is feasible, derive an asymptotic approximation of its risk. Similarly, we study the test risk of soft support vector regression as a function of its parameters. These results are then used to optimally tune the parameters involved in the design of hard and soft support vector regression algorithms. Based on our analysis, we illustrate that adding more samples may be harmful to the test performance of support vector regression, whereas it is always beneficial when the parameters are optimally selected. This result is reminiscent of a phenomenon observed in modern learning architectures, whereby optimally tuned architectures exhibit a test risk curve that decreases with the number of samples. The analysis is then extended to kernel support vector regression under a generalized linear model assumption. Our results pave the way to understanding the effect of the underlying hyperparameters and provide insights on how to optimally choose the kernel function and the different hyperparameters.
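
To make the contrast in the first part concrete, here is a minimal Python sketch (not the estimator derived in the thesis) of the two ideas mentioned above: a classical ridge-regularized inverse covariance estimate, and an estimate built from a spiked model in which the covariance is a low-rank perturbation of a scaled identity. The function names `regularized_precision` and `spiked_precision`, the choice of the rank, and the way the noise level is read off the bulk eigenvalues are illustrative assumptions.

```python
import numpy as np

def regularized_precision(X, gamma=0.1):
    """Classical regularization: invert (sample covariance + gamma * I).

    X: (n, p) training samples; gamma: ridge/regularization parameter.
    """
    p = X.shape[1]
    S = np.cov(X, rowvar=False)            # sample covariance, (p, p)
    return np.linalg.inv(S + gamma * np.eye(p))

def spiked_precision(X, rank=2):
    """Inverse covariance under a low-rank-plus-scaled-identity model.

    Assumes Sigma ~= sigma2 * I + sum_k (lam_k - sigma2) u_k u_k^T, so the
    inverse has a closed form and only the top `rank` eigenpairs of the
    sample covariance need to be estimated.
    """
    p = X.shape[1]
    S = np.cov(X, rowvar=False)
    eigval, eigvec = np.linalg.eigh(S)     # eigenvalues in ascending order
    top_val = eigval[-rank:]
    top_vec = eigvec[:, -rank:]
    sigma2 = eigval[:-rank].mean()         # noise level from the bulk eigenvalues
    # Closed-form inverse of sigma2*I + sum_k (lam_k - sigma2) u_k u_k^T
    inv = np.eye(p) / sigma2
    for lam, u in zip(top_val, top_vec.T):
        inv -= ((lam - sigma2) / (sigma2 * lam)) * np.outer(u, u)
    return inv
```

Either estimate can then be plugged into the usual discriminant score (x - (mu0 + mu1)/2)^T * Omega * (mu0 - mu1), with the class means estimated from the training data; the point of the sketch is only that the spiked variant avoids a full p x p matrix inversion.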
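For the second part, the sketch below illustrates, in a hedged way, what "optimally tuning" soft support vector regression means in practice, using scikit-learn's LinearSVR and a brute-force validation search over (C, epsilon). The thesis instead characterizes the test risk analytically in the high-dimensional limit and optimizes it directly; the grids and the helper name `tune_soft_svr` are illustrative assumptions.

```python
import numpy as np
from itertools import product
from sklearn.svm import LinearSVR

def tune_soft_svr(X_train, y_train, X_val, y_val,
                  C_grid=(0.01, 0.1, 1.0, 10.0),
                  eps_grid=(0.0, 0.05, 0.1, 0.5)):
    """Pick (C, epsilon) for soft linear SVR by validation risk.

    Returns (best_risk, best_C, best_epsilon, fitted_model). The squared
    error on the validation split stands in for the test risk studied in
    the thesis.
    """
    best = None
    for C, eps in product(C_grid, eps_grid):
        model = LinearSVR(C=C, epsilon=eps, max_iter=10000).fit(X_train, y_train)
        risk = np.mean((model.predict(X_val) - y_val) ** 2)
        if best is None or risk < best[0]:
            best = (risk, C, eps, model)
    return best
```

Calling `tune_soft_svr` on a train/validation split of data returns the validation-optimal pair; with such a per-sample-size tuning, the selected risk should not deteriorate as more samples are added, which is the behavior the abstract contrasts with fixed, non-optimized parameters.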

Brief Biography

Houssem Sifaou is a Ph.D. candidate in the Electrical and Computer Engineering Program at King Abdullah University of Science and Technology. He received the Engineering degree in signals and systems from Tunisia Polytechnic School, La Marsa, Tunisia, in 2014 and the M.S. degree in electrical engineering from King Abdullah University of Science and Technology in 2016. His research interests include asymptotic performance analysis of machine learning techniques, massive MIMO, visible light communication, and random matrix theory applications in wireless communications and machine learning.
