By David Murphy, KAUST News
Peter Richtárik, KAUST professor of computer science, recently received a Distinguished Speaker Award at the Sixth International Conference on Continuous Optimization (ICCOPT 2019) held in Berlin from August 3 to 8. ICCOPT 2019 was organized by the Mathematical Optimization Society and was hosted this year by the Weierstrass Institute for Applied Analysis and Stochastics.
Richtárik was recognized for delivering an invited lecture series titled "A Guided Walk Through the ZOO of Stochastic Gradient Descent Methods." His lectures, totaling six hours, highlighted key principles behind how stochastic gradient descent (SGD) works. Many of the insights came from recent research carried out by Richtárik and his KAUST research group.
"SGD is the key method for training modern machine learning models. Despite its huge popularity, it turns out this method was not analyzed the right way and [was] not well understood," Richtárik explained. "My course was based on research done in my group across multiple papers and brought new insights into how this method works. It also highlighted several new variants never considered before—with interesting properties and applications."
Regarding his Distinguished Speaker Award, Richtárik said: "I was the first one to have received [the] award at the conference, and hence I did not expect this at all. I was surprised, humbled and honored. You always feel nice when you're appreciated.
"[The] prize was based on joint research with mostly my students and postdocs. I am really pleased to share this news with my group [members] because they will know that the work that we do is appreciated. I hope this award will act as a motivator for them."
Unpacking artificial intelligence, machine learning and SGD
Machine learning is a branch of artificial intelligence (AI) based on the idea that computer systems can learn from data, recognize patterns and make decisions with minimal human involvement. Increasingly, many leading international companies are becoming aware of the vast potential of AI, machine learning and deep learning. Richtárik noted that SGD is the state-of-the-art method for training machine learning models.
"Whenever you hear something in the media about machine learning and its success, much of the success is due to our ability to train these models, and they are trained using the SGD algorithm in one disguise or another," he said. "It's actually an infinite family of algorithms—we still don't know all of them."
Because each machine learning model has different properties, and because of the sheer number of SGD variants that currently exist, it is becoming increasingly difficult for researchers to keep track of what is known and what is not. This presents an intriguing mathematical and computational challenge for Richtárik and his colleagues at KAUST, and it also motivated the production of the lecture series.
"I focus on developing algorithms [that] work in the big data setting, and there are many ways in which you could define what that means," Richtárik stated. "Put simply, something has to be very difficult and significant about the optimization problem for me to be interested."
"There are many ways in which an optimization problem can be big," he continued. "The way this appears in machine learning is that you collect lots of data, and with each data point, you associate a so-called loss function because you want the model that you train to predict not far from the truth. You then take the average over all of these loss functions over all the data, and then [you] attempt to minimize this over all models so as to identify the best model."
"Choosing these parameters is an optimization problem, and anytime you minimize some function, it's an optimization problem," he added. "Since the function is an average over a huge number of other functions and collected data, it's a big data optimization problem. One can't use traditional techniques, and SGD is the state-of-the-art solution to address this problem."
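The setup Richtárik describes, minimizing the average of many per-data-point loss functions by sampling one at a time, can be sketched in a few lines. This is an illustrative toy example on synthetic least-squares data (all names and parameter choices here are assumptions for the sketch), not code from his group's papers:

```python
import numpy as np

# Synthetic data: n data points, d model parameters, least-squares losses.
rng = np.random.default_rng(0)
n, d = 1000, 5
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = A @ x_true + 0.01 * rng.normal(size=n)

def loss(x):
    # The big-data objective: the average of per-data-point losses
    # f_i(x) = 0.5 * (a_i^T x - b_i)^2 over all n points.
    return 0.5 * np.mean((A @ x - b) ** 2)

def sgd(steps=10_000, lr=0.05):
    x = np.zeros(d)
    for _ in range(steps):
        i = rng.integers(n)                 # sample one data point
        grad_i = (A[i] @ x - b[i]) * A[i]   # gradient of f_i alone
        x -= lr * grad_i                    # cheap stochastic step
    return x

x_hat = sgd()
print(loss(np.zeros(d)), loss(x_hat))  # the loss drops sharply
```

The point of the sketch is the cost structure: each step touches a single data point, so the per-iteration cost is independent of n, which is why traditional full-gradient techniques give way to SGD in the big-data regime.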
Past and present: Helping to invent Federated Optimization
Before joining KAUST in 2017, Richtárik obtained a master's degree in mathematics at Comenius University in his native Slovakia. In 2007, he received his Ph.D. in operations research from Cornell University, and in 2009 he joined the School of Mathematics at the University of Edinburgh as an assistant professor. During his time at KAUST, Richtárik feels he has managed to roughly double his research group's annual output, due chiefly to being able to appoint more students than he could earlier in his career.
"I always had loads of ideas, but I never had the time to think about all of them," Richtárik said. "Through the help of my students and postdocs, we can address more ideas and problems. I am leading an AI committee at KAUST [that] is helping with the [University's] AI initiative, and I'm trying to help other colleagues at KAUST who are not yet familiar with machine learning to get into the topic through collaboration. This is mutually rewarding, as we have very smart faculty [members] and students."
Moving forward, Richtárik is particularly excited about an approach to machine learning called Federated Optimization. The approach—which was pioneered by Richtárik with his former Edinburgh student Jakub Konečný (who is now working at Google) and a team at Google in Seattle—trains machine learning models without the need to send private data (stored on mobile devices) to company servers.
"We want to train machine learning models without the data ever leaving the mobile device to prevent any privacy leaks. This is called federated learning," Richtárik explained. "Google, in collaboration with us, created a huge system out of this. Currently, federated learning is in use by more than a billion mobile devices worldwide."
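The division of labor in federated learning can be illustrated with a minimal sketch: each "device" runs a few local SGD steps on its own data, and the server only ever sees and averages model parameters. This toy simulation (synthetic least-squares data, made-up step sizes and round counts) illustrates the idea only; it is not Google's production system or the specific algorithms from Richtárik's papers:

```python
import numpy as np

rng = np.random.default_rng(1)
d, num_clients = 5, 10

# Each client holds private data that never leaves the "device".
x_true = rng.normal(size=d)
clients = []
for _ in range(num_clients):
    A = rng.normal(size=(50, d))
    clients.append((A, A @ x_true))

def local_update(x, A, b, lr=0.01, local_steps=20):
    # A client refines the current model using its own data only.
    x = x.copy()
    for _ in range(local_steps):
        i = rng.integers(len(b))
        x -= lr * (A[i] @ x - b[i]) * A[i]
    return x

# Server loop: broadcast the model, collect locally updated models,
# and average them. Only model parameters cross the network.
x = np.zeros(d)
for _ in range(50):
    updates = [local_update(x, A, b) for A, b in clients]
    x = np.mean(updates, axis=0)

print(np.linalg.norm(x - x_true))  # the averaged model approaches x_true
```

The privacy property Richtárik describes comes from the communication pattern: raw data stays on the client, and only parameter vectors are exchanged.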