Thursday, April 08, 2021, 11:00
Machine learning is emerging as a powerful tool for data science and is being applied across almost all disciplines. In many applications, the number of features is comparable to, or even larger than, the number of samples, and both grow large. This setting is usually called the high-dimensional regime, and it raises new challenges and questions for the application of machine learning. In this work, we conduct a high-dimensional performance analysis of some popular classification and regression techniques. In the first part, discriminant analysis classifiers are considered. A major challenge to the practical use of these classifiers is that they depend on the inverses of covariance matrices, which must be estimated from training data. Several estimators of these inverse covariance matrices can be used; the most common are based on the regularization approach. The main advantage of such estimators is their resilience to sampling noise, which makes them suitable for high-dimensional settings. In this thesis, we propose new estimators that are shown to yield better performance.