First Provably Optimal Asynchronous SGD for Homogeneous and Heterogeneous Data

This talk will discuss how to design asynchronous optimization methods that remain fast, stable, and even provably optimal.

Overview

Training today’s large AI models can take weeks or even months on thousands of GPUs, consuming enormous amounts of computation and energy. While hardware keeps getting faster, the algorithms that coordinate this training haven’t kept up. Most still rely on synchronous updates—everyone waits for the slowest machine—wasting both time and resources.

A natural idea is to remove synchronization and let each machine work independently. But this creates new challenges: updates arrive at different times, use stale information, and make the mathematics of convergence far more complex.
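To make the staleness issue concrete, here is a minimal conceptual sketch (not the speaker's method) of asynchronous SGD on a toy quadratic objective: the server applies each gradient as soon as it arrives, even though it was computed from an older copy of the parameters. The objective, the random delay model, and all names (`async_sgd`, `grad`, `lr`, `max_delay`) are illustrative assumptions, not details of the talk.

```python
import random

def grad(x):
    # Gradient of f(x) = 0.5 * x^2; stands in for a worker's stochastic gradient.
    return x

def async_sgd(steps=100, lr=0.1, max_delay=3, seed=0):
    random.seed(seed)
    x = 10.0                          # shared parameter held by the server
    history = [x] * (max_delay + 1)   # recent iterates, so a worker can read a stale copy
    for _ in range(steps):
        delay = random.randint(0, max_delay)  # how stale this worker's view is (illustrative)
        stale_x = history[-(delay + 1)]       # parameter snapshot the worker actually used
        x -= lr * grad(stale_x)               # server applies the stale gradient immediately
        history.append(x)
        history = history[-(max_delay + 1):]  # keep only the window needed for staleness
    return x

print(async_sgd())  # converges toward 0 despite stale updates (for a small enough step size)
```

Even in this toy setting, the interplay between the step size and the delay is what makes the convergence analysis of asynchronous methods more delicate than in the synchronous case.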

In this talk, I’ll discuss how we can design asynchronous optimization methods that remain fast, stable, and even provably optimal. I’ll show that with the right structure, coordination without synchronization is not only possible, but can achieve the best possible time complexity—training large models faster and more efficiently than today’s standard methods.

Presenters

Brief Biography

Artavazd Maranjyan is a Ph.D. Candidate in Computer Science at King Abdullah University of Science and Technology (KAUST), advised by Prof. Peter Richtárik.

His work has been published in leading machine learning conferences such as ICML, ICLR, and UAI. He has received the CEMSE Dean’s List Award for academic and research excellence, and the Dean’s List Award upon joining KAUST.

Before starting his Ph.D., Artavazd earned his M.S. and B.S. degrees from Yerevan State University, where he co-authored several papers in harmonic analysis under the guidance of Prof. Martin Grigoryan.