Machine Learning in Healthcare: When Low Sample Size is not a Limitation

Methodology at the intersection of machine learning and medicine has advanced rapidly in recent years, especially in scenarios where large amounts of high-quality labeled data are available. In most practical situations in medicine, however, data are scarce or difficult to access, the endpoints of interest are rare, or the process of generating labels is difficult, time consuming, and expensive, all of which make it hard to assemble large supervised datasets. In this talk, I will describe three use cases that highlight present challenges and opportunities for the development of machine learning methodology in healthcare. First, I will describe simple word-embedding approaches to bag-of-words document classification and their application to the diagnosis of peripheral artery disease from clinical narratives. Second, I will present an approach for volumetric image classification that leverages attention mechanisms, contrastive learning, and feature-encoding sharing for geographic atrophy prognosis from optical coherence tomography images. Third, I will discuss machine learning approaches to multi-modal and multi-dataset integration for biomarker discovery from molecular (omics) data. To conclude, I will summarize the contributions and insights in each of these directions, in which relatively low sample sizes are the common denominator.
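The first use case rests on a standard idea: represent a clinical note as the average of its word embeddings, then classify that vector. The sketch below illustrates this with a toy, hypothetical embedding table and a nearest-centroid classifier; it is not the speaker's actual method, the vocabulary and vectors are invented for illustration, and a real system would use pretrained embeddings (e.g., word2vec or GloVe) and a learned classifier.

```python
import numpy as np

# Hypothetical 3-d word embeddings for illustration only; real systems
# would load pretrained vectors trained on large text corpora.
EMB = {
    "claudication": np.array([0.9, 0.1, 0.0]),
    "ischemia":     np.array([0.8, 0.2, 0.1]),
    "stenosis":     np.array([0.7, 0.3, 0.0]),
    "cough":        np.array([0.0, 0.9, 0.2]),
    "fever":        np.array([0.1, 0.8, 0.3]),
}

def doc_vector(tokens):
    """Represent a document as the mean of its word embeddings."""
    vecs = [EMB[t] for t in tokens if t in EMB]
    return np.mean(vecs, axis=0)

# Toy training notes for two labels: peripheral artery disease vs. other.
train = {
    "pad":   [["claudication", "ischemia"], ["stenosis", "ischemia"]],
    "other": [["cough", "fever"], ["fever", "cough"]],
}

# Nearest-centroid classifier: average the document vectors per class.
centroids = {label: np.mean([doc_vector(d) for d in docs], axis=0)
             for label, docs in train.items()}

def classify(tokens):
    """Assign the label whose centroid is closest to the document vector."""
    v = doc_vector(tokens)
    return min(centroids, key=lambda c: np.linalg.norm(v - centroids[c]))

print(classify(["claudication", "stenosis"]))  # prints "pad"
```

The appeal of this kind of model in low-sample-size settings is that the embeddings carry knowledge from large unlabeled corpora, so the supervised component has very few parameters to fit.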

Brief Biography

Ricardo Henao, a quantitative scientist, is an Assistant Professor in the Department of Biostatistics and Bioinformatics at Duke University. He is also affiliated with the Department of Electrical and Computer Engineering (ECE), the Information Initiative at Duke (iiD), the Center for Applied Genomics and Precision Medicine (CAGPM), the Forge (Duke's center for actionable health data science), and the Duke Clinical Research Institute (DCRI), all at Duke University. The theme of his research is the development of statistical methods and machine learning algorithms based primarily on probabilistic modeling. His expertise covers several fields, including applied statistics, signal processing, pattern recognition, and machine learning. His methods research focuses on hierarchical or multilayer probabilistic models for complex data, such as data characterized by high dimensionality, multiple modalities, more variables than observations, noisy measurements, missing values, and time series. Most of his applied work is dedicated to the analysis of biological data such as gene expression, medical imaging, clinical narratives, and electronic health records. His recent work has focused on the development of machine learning models, including deep learning approaches, for the analysis and interpretation of clinical and biological data, with applications to predictive modeling for diverse clinical outcomes.