AI4GH Seminar Series - Vector Representation of Biological Entities Based on Ontologies and Their Annotations


Overview

Abstract

Biological knowledge is widely represented in the form of ontology-based annotations: ontologies describe the phenomena assumed to exist within a domain, and the annotations associate a biological entity with a set of phenomena within the domain. The structure and information contained in ontologies and their annotations make them valuable for developing machine learning, data analysis, and knowledge extraction algorithms. In addition to formally structured axioms, ontologies contain metadata in the form of annotation axioms that characterize ontology classes. Commonly used annotation axioms include class labels, descriptions, and synonyms. Despite being a rich source of semantic information, ontology metadata are generally left unexploited by ontology-based analysis methods.
We propose an approach to learn feature vectors for biological entities based on their annotations to biomedical ontologies and on ontology metadata. We validate our method through two main experiments: prediction of protein-protein interactions for human and yeast, and prediction of gene-disease associations on human and mouse datasets.

Brief Biography

Fatima Zohra Smaili is a Ph.D. student of Professor Xin Gao at CBRC. Her research involves machine learning applications in bioinformatics, including biomedical ontology analysis, protein function prediction, and gene-disease association prediction.

More Information:

Light lunch will be provided.

Presenters