Communication-Efficient Distributed Newton-type Algorithms for Federated Learning
Overview
Abstract
In the quest for high-accuracy machine learning models, both model size and, consequently, the amount of data needed for training have grown enormously over time. As a result, performing the learning process on a single machine is often infeasible. In a typical distributed learning scenario, the training data is spread across different machines, and training is therefore carried out in a distributed manner. In another scenario, most common in federated learning, the training data is inherently distributed across a large number of mobile edge devices because of data privacy concerns. Large-scale distributed optimization has therefore become the default tool for training supervised machine learning models with large numbers of parameters and large amounts of training data.
In all distributed and federated learning settings, communicating information (e.g., the current stochastic gradient vector or the current state of the model) between computing devices is unavoidable, and this communication forms the primary bottleneck of such systems. The issue is especially apparent in federated learning, where the computing nodes are devices with substantially less computing power and the network bandwidth is considerably lower.
Despite their high computation and communication costs, Newton-type methods remain an appealing option for distributed training because they are robust to ill-conditioning in convex problems. In this talk, I will discuss communication compression and aggregation mechanisms for curvature information to reduce these costs while preserving theoretically superior local convergence guarantees.
Brief Biography
Mher Safaryan is a Postdoctoral Research Fellow working with Professor Peter Richtarik at the Visual Computing Center (VCC) at King Abdullah University of Science and Technology (KAUST). He received his Ph.D. in Mathematics from Yerevan State University, Armenia, under the supervision of Prof. Grigori Karagulyan. Before joining KAUST, he was a Junior Researcher in the Real Analysis Department of the Institute of Mathematics of the National Academy of Sciences in Armenia. He then joined the Computer, Electrical, and Mathematical Sciences and Engineering Division (CEMSE) as a Research Technician before joining Prof. Peter Richtarik's research group in 2019.