By David Murphy
KAUST Professor of Computer Science Peter Richtárik and his former student Nicolas Loizou recently received the 2020 Computational Optimization and Applications (COAP) Best Paper Award. Loizou is currently a postdoctoral researcher at Mila - Quebec Artificial Intelligence Institute and will soon take up an assistant professorship at Johns Hopkins University.
Ranked one of the top ten international optimization journals, COAP publishes research on the analysis and development of computational algorithms and modeling technology for optimization.
The journal’s editorial board voted the KAUST-authored paper "Momentum and stochastic momentum for stochastic gradient, Newton, proximal point and subspace descent methods" the best paper of 2020 from a field of 93 featured papers. The winning paper studied several stochastic optimization algorithms enriched with the so-called “heavy ball momentum,” a convergence acceleration trick originally proposed by the Russian mathematician Boris Polyak in 1964.
Over the last decade, this optimization trick has become immensely popular in the machine learning community because it helps speed up the training of modern machine learning models. Such models are typically trained with an optimization method called stochastic gradient descent (SGD), which updates the model using one (random) data example at a time. In practice, the heavy ball method has been observed to significantly shorten training time and improve the quality of the trained model.
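For readers who want to see the idea in code, the following is a minimal sketch of SGD with heavy ball momentum on a toy one-dimensional problem. It is not taken from the award-winning paper. The function name `sgd_heavy_ball`, the data, the step size `lr` and the momentum parameter `beta` are all illustrative assumptions. Each step uses the gradient of one randomly chosen data example and adds a fraction of the previous move, `beta * (w - w_prev)`, which is the momentum term.

```python
import random

def sgd_heavy_ball(xs, ys, lr=0.5, beta=0.3, steps=500, seed=0):
    """Fit y ~ w * x with SGD plus heavy ball momentum (toy 1-D sketch)."""
    rng = random.Random(seed)
    w_prev = 0.0  # previous iterate
    w = 0.0       # current iterate
    for _ in range(steps):
        i = rng.randrange(len(xs))          # pick one random data example
        grad = (w * xs[i] - ys[i]) * xs[i]  # gradient of 0.5 * (w*x_i - y_i)^2
        # Gradient step plus the momentum term beta * (w - w_prev)
        w_next = w - lr * grad + beta * (w - w_prev)
        w_prev, w = w, w_next
    return w

# Hypothetical noiseless data generated by y = 3 * x
xs = [1.0, 1.2, 0.9, 1.1]
ys = [3.0 * x for x in xs]
w_hat = sgd_heavy_ball(xs, ys)
print(w_hat)
```

Setting `beta=0.0` recovers plain SGD, so the two variants can be compared on the same data. The step size and momentum values here were chosen only so that this toy example converges; they are not recommendations for real models.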
“In our paper, we managed to prove mathematically that SGD (and many other methods) benefit from the heavy ball momentum in the way researchers and practitioners believed it to work but were unable to show conclusively before that this is the case,” Richtárik explained.
“We are naturally very honored to have received this award. Looking at the list of past recipients, we are in excellent company. The COAP award is an enormous morale boost for our future research endeavors into popular optimization tricks currently used in the machine learning community, but which are not yet understood theoretically,” he concluded.