Stein Variational Gradient Descent and Consensus-Based Optimization: Towards a Convergence Analysis and Generalization

Overview

Abstract

The first part of the dissertation studies the convergence properties of Stein Variational Gradient Descent (SVGD), a sampling algorithm with applications in machine learning. It analyzes SVGD in the population limit under several conditions, including Talagrand's T1 inequality and the (L0, L1)-smoothness condition. It also introduces an improved version of SVGD with importance weights and demonstrates its potential to accelerate convergence and enhance stability.
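
For readers unfamiliar with SVGD, the following is a minimal sketch of the standard finite-particle update with a Gaussian (RBF) kernel. It is not the importance-weighted variant from the dissertation, and it is the particle approximation rather than the population limit analyzed there. The names rbf_kernel and svgd_step, the bandwidth h, and the step size are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(x, h):
    """Gaussian kernel matrix and its gradient for particles x of shape (n, d).

    K[j, i] = k(x_j, x_i); grad_K[j, i] = gradient of k(x_j, x_i) w.r.t. x_j.
    The fixed bandwidth h is an assumption of this sketch.
    """
    diff = x[:, None, :] - x[None, :, :]          # diff[j, i] = x_j - x_i
    sq_dist = np.sum(diff ** 2, axis=-1)
    K = np.exp(-sq_dist / (2.0 * h ** 2))
    grad_K = -diff / h ** 2 * K[:, :, None]
    return K, grad_K

def svgd_step(x, score, step, h=1.0):
    """One standard SVGD update; score(x) returns grad log p at each particle."""
    n = x.shape[0]
    K, grad_K = rbf_kernel(x, h)
    # phi(x_i) = (1/n) * sum_j [ k(x_j, x_i) * score(x_j) + grad_{x_j} k(x_j, x_i) ]
    # The first term drives particles to high density, the second repels them.
    phi = (K.T @ score(x) + grad_K.sum(axis=0)) / n
    return x + step * phi
```

As a usage example, for a standard Gaussian target the score is score(x) = -x, and repeatedly applying svgd_step to particles initialized far from the origin moves them toward the target.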

The second part of the dissertation explores the convergence of Consensus-Based Optimization (CBO) methods. We first propose CBO with truncated noise and provide theoretical guarantees for its global convergence to the global minimizer of nonconvex and nonsmooth objective functions. We then design a CBO dynamic for objectives with multiple global minimizers and provide a theoretical guarantee that this dynamic concentrates around the set of global minimizers of the target objective.
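
To make the CBO idea concrete, here is a minimal sketch of one Euler-Maruyama step of a CBO-type particle system. Each particle drifts toward a weighted average (the consensus point) and receives noise scaled by its distance to that point. Capping the noise scale at a constant M is an assumed, simplified form of truncation used only for illustration; the exact scheme in the dissertation may differ, and the variant for multiple global minimizers is not shown. All function and parameter names are hypothetical.

```python
import numpy as np

def consensus_point(x, f, alpha):
    """Weighted average of particles x (n, d) with weights exp(-alpha * f(x_i))."""
    fx = f(x)                                   # objective values, shape (n,)
    w = np.exp(-alpha * (fx - fx.min()))        # shift by the minimum for numerical stability
    return (w[:, None] * x).sum(axis=0) / w.sum()

def cbo_step(x, f, rng, alpha=30.0, lam=1.0, sigma=0.5, dt=0.01, M=1.0):
    """One CBO step with the noise scale capped at M (an assumption of this sketch)."""
    v = consensus_point(x, f, alpha)
    diff = x - v
    dist = np.linalg.norm(diff, axis=1, keepdims=True)
    noise_scale = np.minimum(dist, M)           # truncated noise amplitude
    xi = rng.standard_normal(x.shape)           # independent Gaussian increments
    return x - lam * dt * diff + sigma * np.sqrt(dt) * noise_scale * xi
```

With sigma set to 0 the step is deterministic and every particle moves a fraction lam * dt of the way toward the consensus point, which is a convenient sanity check before enabling noise.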

Brief Biography

Lukang Sun has been a Ph.D. student in Computer Science at KAUST since 2021, under the supervision of Prof. Peter Richtarik. His research focuses on interacting particle systems and their applications to machine learning, engineering, and other areas.

Presenters