Monday, March 30, 2020, 18:00
In this dissertation, we aim at theoretically studying and analyzing deep learning models. Since deep models substantially vary in their shapes and sizes, in this dissertation, we restrict our work to a single fundamental block of layers that is common in almost all architectures. The block of layers of interest is the composition of an affine layer followed by a nonlinear activation function and then lastly followed by another affine layer. We study this block of layers from three different perspectives. (i) An Optimization Perspective. We try addressing the following question: Is it possible that the output of the forward pass through the block of layers highlighted above is an optimal solution to a certain convex optimization problem? As a result, we show an equivalency between the forward pass through this block of layers and a single iteration of certain types of deterministic and stochastic algorithms solving a particular class of tensor formulated convex optimization problems.