In this talk, we develop a method for simultaneously estimating the density functions of a collection of populations of protein backbone angle pairs. Each log density function in the collection is modeled as a linear combination of a common set of basis functions. The shared basis functions are modeled as bivariate splines on triangulations and are estimated from the data. The circular nature of angular data is taken into account by imposing appropriate smoothness constraints across the boundaries of the triangles. Maximum penalized likelihood is used to fit the model, and an alternating block-wise Newton-type algorithm is developed for computation. A simulation study shows that the collective estimation approach is statistically more efficient than estimating the densities separately. The proposed method is applied to estimate neighbor-dependent distributions of protein backbone dihedral angles (i.e., Ramachandran distributions). Our estimated distributions show competitive performance when used for angular-sampling-based protein structure prediction.
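As a rough illustration of the model structure described above, the following Python sketch builds several densities on the torus whose log densities are linear combinations of one shared set of basis functions. It is only a sketch under stated assumptions: a small trigonometric basis stands in for the bivariate splines on triangulations, the population names and coefficient values are made up, and no penalized likelihood fitting is performed.

```python
import math

def basis(phi, psi):
    # Hypothetical shared basis: periodic in both angles, so every density
    # built from it respects the circular nature of the data. This is a
    # stand-in for the bivariate splines on triangulations used in the talk.
    return [math.cos(phi), math.sin(phi),
            math.cos(psi), math.sin(psi),
            math.cos(phi + psi)]

def make_density(coefs, n_grid=72):
    # log f(phi, psi) = sum_k coefs[k] * basis_k(phi, psi) - log Z,
    # where Z is approximated by a midpoint rule on [-pi, pi)^2.
    h = 2 * math.pi / n_grid
    grid = [-math.pi + (j + 0.5) * h for j in range(n_grid)]

    def unnormalized(phi, psi):
        return math.exp(sum(a * b for a, b in zip(coefs, basis(phi, psi))))

    z = sum(unnormalized(p, q) for p in grid for q in grid) * h * h

    def density(phi, psi):
        return unnormalized(phi, psi) / z

    return density

# Made-up coefficients for two hypothetical populations. Only these vectors
# differ between populations; the basis functions are common to all of them.
coefs_by_population = {
    "population_a": [1.0, -0.5, 0.3, 0.8, 0.2],
    "population_b": [0.4, 0.9, -0.7, 0.1, -0.3],
}

densities = {name: make_density(c) for name, c in coefs_by_population.items()}
```

Because each population contributes only a coefficient vector, information about the shape of the basis is pooled across populations, which is the intuition behind the efficiency gain reported in the simulation study.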
Mehdi Maadooliat received his Ph.D. in Statistics from Texas A&M University, where he also served as a post-doctoral fellow. His doctoral research focused on dimension reduction and data analysis in non-Gaussian frameworks. More recently, he has been working on modeling large spherical data structures, with an application to protein structure prediction and classification. His primary research interests include bioinformatics, machine learning, functional data analysis, and skewed distributions. He is an Associate Editor for the Journal of Statistical Theory and Applications.