Ringmaster ASGD: The First Asynchronous SGD with Optimal Time Complexity
B9, L2, R2325
Asynchronous Stochastic Gradient Descent (Asynchronous SGD) is a cornerstone method for parallelizing learning in distributed machine learning. However, its performance suffers under arbitrarily heterogeneous computation times across workers, leading to suboptimal time complexity and inefficiency as the number of workers scales.
Overview
While several Asynchronous SGD variants have been proposed, recent findings by Tyurin & Richtárik (NeurIPS 2023) reveal that none achieve optimal time complexity, leaving a significant gap in the literature. In this paper, we propose Ringmaster ASGD, a novel Asynchronous SGD method designed to address these limitations and tame the inherent challenges of Asynchronous SGD. We establish, through rigorous theoretical analysis, that Ringmaster ASGD achieves optimal time complexity under arbitrarily heterogeneous and dynamically fluctuating worker computation times. This makes it the first Asynchronous SGD method to meet the theoretical lower bounds for time complexity in such scenarios.
Presenters
Brief Biography
Arto Maranjyan is a PhD student at KAUST, advised by Peter Richtárik. Recently, his focus has been on the theory and design of asynchronous optimization methods—developing efficient, scalable, and theoretically grounded algorithms. More broadly, he works on optimization for machine learning and federated learning. Arto earned his M.s. and B.S. degrees from Yerevan State University. During my bachelor’s studies, he co-authored several papers in Harmonic Analysis under the guidance of Prof. Martin Grigoryan.