By Professor Peter Richtarik (KAUST)
While this is not the primary goal of the talk, it can serve as a modern introduction to the field of randomized algorithms in optimization.
We propose a new paradigm for solving linear systems with a very large number of equations. In our paradigm, the system is first reformulated into a stochastic problem, and then solved with a randomized algorithm.
Our stochastic reformulation is flexible as it depends on a user-defined parameter in the form of a distribution defining an ensemble of random matrices. The choice of the distribution directly influences the “condition number” of the reformulation, which leads to the novel concept of “randomized preconditioning”. We give necessary and sufficient conditions for the reformulation to be exact, i.e., for the solution set of the stochastic problem to be identical to the solution set of the linear system. We also show that the reformulation can be equivalently seen as a stochastic optimization problem, a stochastically preconditioned linear system, a stochastic fixed-point problem and a probabilistic intersection problem. For instance, the condition number of the reformulation is equal to the condition number of the stochastically preconditioned linear system, and to the condition number of the Hessian of the objective function appearing in the stochastic optimization reformulation.
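To make the four viewpoints concrete, here is one possible way of writing them down. The notation is an assumption and is not fixed by the abstract: S is a random matrix drawn from the user-defined distribution D, L_S is the solution set of the sketched system, Π denotes Euclidean projection, and the system Ax = b is assumed consistent. The talk may use a more general geometry.

```latex
\begin{align*}
\text{linear system:}\quad
  & Ax = b, \qquad A \in \mathbb{R}^{m \times n}, \\
\text{sketched system:}\quad
  & \mathcal{L}_S = \{x : S^\top A x = S^\top b\}, \qquad S \sim \mathcal{D}, \\
\text{stochastic optimization:}\quad
  & \min_x \; f(x) = \mathbb{E}_{S \sim \mathcal{D}}
    \Big[\tfrac{1}{2}\,\big\|x - \Pi_{\mathcal{L}_S}(x)\big\|_2^2\Big], \\
\text{stochastically preconditioned system:}\quad
  & A^\top \mathbb{E}[H]\, A x = A^\top \mathbb{E}[H]\, b,
    \qquad H = S\,(S^\top A A^\top S)^{\dagger} S^\top, \\
\text{stochastic fixed point:}\quad
  & x = \mathbb{E}_{S \sim \mathcal{D}}\big[\Pi_{\mathcal{L}_S}(x)\big], \\
\text{probabilistic intersection:}\quad
  & \mathbb{P}_{S \sim \mathcal{D}}\big(x \in \mathcal{L}_S\big) = 1.
\end{align*}
```

In this notation the gradient of each term in the stochastic optimization problem is x − Π(x) = AᵀH(Ax − b), so setting the expected gradient to zero gives the preconditioned system, and rearranging gives the fixed-point form.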
Further, we propose and analyze basic, parallel and accelerated stochastic algorithms for solving the reformulated problem, with linear convergence rates. The methods have natural and sometimes surprising interpretations from the viewpoint of each of the four reformulations. For instance, the methods can be interpreted as basic, parallel and accelerated variants of stochastic gradient descent, stochastic Newton descent, stochastic projection method and stochastic fixed-point method. The complexity of the basic variants scales linearly with the condition number of the reformulation, while the accelerated variants scale with the square root of the condition number.
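As a small illustration of the stochastic projection viewpoint, the sketch below implements the randomized Kaczmarz method, a well-known projection-type scheme in which each random matrix selects a single equation. The abstract does not name this method, so treat it as an assumed special case chosen for illustration; the function name `randomized_kaczmarz`, its parameters, and the test system are all hypothetical.

```python
import random


def randomized_kaczmarz(A, b, iters=2000, seed=0):
    """Approximately solve a consistent system A x = b.

    Each step samples one equation (with probability proportional to the
    squared norm of its row) and projects the current iterate onto the
    hyperplane defined by that equation.
    """
    rng = random.Random(seed)
    n = len(A[0])
    x = [0.0] * n
    row_norms_sq = [sum(a * a for a in row) for row in A]
    # Skip all-zero rows: they carry no information and cannot be projected onto.
    rows = [i for i, w in enumerate(row_norms_sq) if w > 0.0]
    weights = [row_norms_sq[i] for i in rows]
    for _ in range(iters):
        i = rng.choices(rows, weights=weights)[0]
        residual = sum(a * xj for a, xj in zip(A[i], x)) - b[i]
        step = residual / row_norms_sq[i]
        # Orthogonal projection of x onto {y : A[i] . y = b[i]}.
        x = [xj - step * a for xj, a in zip(x, A[i])]
    return x


# A small consistent system with a known solution (illustrative data).
A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
x_true = [1.0, 2.0]
b = [sum(a * t for a, t in zip(row, x_true)) for row in A]
x = randomized_kaczmarz(A, b)
```

Changing how rows are sampled changes the distribution behind the method, which is the knob the abstract refers to as influencing the condition number of the reformulation.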
Moreover, all our methods lend themselves to a natural dual interpretation as “stochastic subspace ascent” methods, a novel class of optimization algorithms not analyzed before. Stochastic dual coordinate ascent and stochastic dual Newton ascent arise in special cases. We prove global linear convergence of all our algorithms. Further, we highlight a close connection to recent algorithmic developments in machine learning through casting the problem as an instance of the Empirical Risk Minimization problem in a new regime not studied before.
The above development can be extended to matrix inversion. In particular, we develop and analyze a broad family of stochastic/randomized algorithms for inverting a matrix, with specialized variants maintaining symmetry and/or positive definiteness of the iterates. All methods in the family converge globally and linearly, with explicit rates. In special cases, we obtain stochastic block variants of several quasi-Newton updates, including bad Broyden (BB), good Broyden (GB), Powell-symmetric-Broyden (PSB), Davidon-Fletcher-Powell (DFP) and Broyden-Fletcher-Goldfarb-Shanno (BFGS). Ours are the first stochastic versions of these updates shown to converge to an inverse of a fixed matrix. Through a dual viewpoint we uncover a fundamental link between quasi-Newton updates and approximate inverse preconditioning. We also develop an adaptive variant of randomized block BFGS, where we modify the distribution underlying the stochasticity of the method throughout the iterative process to achieve faster convergence. Finally, for rectangular and non-invertible matrices, variants of our methods can be shown to converge to the Moore-Penrose pseudoinverse.
Biography: Peter Richtarik is an Associate Professor of Computer Science and Mathematics at KAUST and an Associate Professor of Mathematics at the University of Edinburgh. He is an EPSRC Fellow in Mathematical Sciences, Fellow of the Alan Turing Institute, and is affiliated with the Visual Computing Center and the Extreme Computing Research Center at KAUST. Dr. Richtarik received his PhD from Cornell University in 2007, and then worked as a Postdoctoral Fellow in Louvain, Belgium, before joining Edinburgh in 2009, and KAUST in 2017. Dr. Richtarik’s research interests lie at the intersection of mathematics, computer science, machine learning, optimization, numerical linear algebra, high performance computing and applied probability. Through his recent work on randomized decomposition algorithms (such as randomized coordinate descent methods, stochastic gradient descent methods and their numerous extensions, improvements and variants), he has contributed to the foundations of the emerging fields of big data optimization, randomized numerical linear algebra, and stochastic methods for empirical risk minimization. Several of his papers have attracted international awards, including the SIAM SIGEST Best Paper Award, the IMA Leslie Fox Prize (2nd prize, twice), and the INFORMS Computing Society Best Student Paper Award (sole runner-up). He is the founder and organizer of the Optimization and Big Data workshop series.
For more info, contact Professor Peter Richtarik. Email: firstname.lastname@example.org
Date: Thursday 13th Apr 2017
Time: 12:00 PM - 01:00 PM
Location: Building 9, Hall 1
Light Lunch will be available at 11:45 AM