AI in Cancer Precision Medicine Jul 25, 08:00 - Jul 26, 17:00 B3 R5220 About The workshop on "AI in Cancer Precision Medicine" is jointly organized by researchers from the CBRC, the University of Birmingham, and the University of Cambridge, as part of the KAUST-sponsored Center Partnership program. At the workshop, we will investigate how novel AI technologies, including progress in machine learning, knowledge representation, and reasoning, can be applied to improving diagnosis and treatment of cancer in the era of genomic medicine. Speakers Dr. Paul Schofield, Dr. Adeeb Noor, Dr. Andreas Karwath. Agenda The workshop is open to all on Wednesday the 25th of July
AI in Cancer Precision Medicine Workshop Dr. Paul Schofield Jul 25, 08:00 - Jul 26, 13:00 B3 R5220 We will investigate how novel AI technologies, including progress in machine learning, knowledge representation, and reasoning, can be applied to improving diagnosis and treatment of cancer in the era of genomic medicine.
Bio2Vec Kickoff Meeting Apr 25, 08:00 - Apr 26, 17:00 Paris France big data machine learning Data Analytics Research Scientist Senay Kafkas will be attending the Bio2Vec kickoff meeting in Paris, France.
Neural Inductive Matrix Factorization for Predicting Disease-Gene Associations Siqing Hou, M.S., Computer Science Apr 18, 10:00 - 11:30 B3 R5208 bioinformatics machine learning Disease-Gene Associations In silico prioritization of undiscovered associations can help find causal genes of newly discovered diseases. Some existing methods are based on known associations and side information of diseases and genes. We explore the possibility of using a neural network model, Neural Inductive Matrix Completion (NIMC), in disease-gene prediction.
Ontology Design Patterns for Combining Pathology and Anatomy: Application to Study Ageing and Longevity in Inbred Mouse Strains Sarah Alghamdi, Ph.D. Student, Computer Science Apr 10, 13:00 - 14:30 B9 R3120 biomedicine Ontologies data analysis semantic analysis computation techniques Abstract In biomedical research, ontologies are widely used to represent knowledge as well as annotate datasets. Many of the existing ontologies cover a single type of phenomenon, such as a process, cell type, gene, pathological entity or anatomical structure. Consequently, multiple ontologies are required to fully characterize the observations in the datasets. Although this allows precise annotation of different aspects of a given dataset, it limits our ability to use the ontologies in data analysis, as the ontologies are usually disconnected and their combination cannot be exploited
Big Data in Biodiversity and Health Mar 26, 09:30 - Mar 28, 13:30 B3 L5 5209 About We are witnessing today an enormous increase in the volume and complexity of data across a variety of domains, including bioscience. Extracting useful information from such data is challenging. Although many approaches have already been developed, efficient analysis of big data in the bioscience domain is far from satisfactory. Biodiversity and health are prominently characterized by a high volume of data with great complexity of information contained, which leads to various approaches to data analysis. The goal of this workshop is to present a selection of efforts currently being made at
Computational and Statistical Interface to Big Data Xin Gao, Program Chair, Computer Science Mar 19, 08:00 - Mar 21, 17:00 B9 L2 H2 We are now in the fourth paradigm of science: Data Science. The massive amount of structured and unstructured data has posed new challenges and opportunities to the fields of computer science and statistics. Traditional computational and statistical methods for data storage, curation, sharing, querying, updating, visualization, analysis, and privacy have been shown to fail in the big data scenario due to the unprecedented volume, velocity, variety, veracity and value of big data. This conference will bring together a number of prominent researchers in Computer Science and Statistics with common interests and active research in big data, as well as the researchers at KAUST who regularly generate or face big data, such as those in bioscience and Red Sea research.
Symbolic AI in Computational Biology Robert Hoehndorf, Associate Professor, Computer Science Mar 12, 12:00 - 13:00 B9 H1 R2322 About The life sciences have invested significant resources in the development and application of semantic technologies to make research data accessible and interlinked, and to enable the integration and analysis of data. Utilizing the semantics associated with research data in data analysis approaches is often challenging. Now, novel methods are becoming available that combine symbolic methods and statistical methods in Artificial Intelligence. In my talk, I will describe how to combine symbolic and statistical Artificial Intelligence approaches for the analysis of biological and biomedical
Causality-based new drug development: Some successful cases and a new challenge Naoyuki Kamatani, MD, PhD Mar 11, 12:00 - 13:00 B2 L5 R5209 artificial intelligence biomedicine drug development Abstract The success rate of new drug development is extremely low. It is even lower when the new drug has a novel mechanism of action. From my experience, I propose that the success rate can be dramatically increased by predicting the effects of a drug in humans based on causality-confirmed data. It is dangerous to develop new drugs based on data in which causality is not confirmed. In biology, there are three different types of relationships in which the causality is confirmed, i.e. the relationships between parent and child, between gene and phenotype and between intervention and
Symbolic AI in Computational Biology: Applications to Disease Gene and Drug Target Identification Robert Hoehndorf, Associate Professor, Computer Science Feb 26, 16:30 - 17:30 The University of Cambridge in the United Kingdom Abstract KAUST Assistant Professor Robert Hoehndorf will give a seminar on "Symbolic AI in Computational Biology: Applications to Disease Gene and Drug Target Identification" at the University of Cambridge in the United Kingdom. More Information The life sciences have invested significant resources in the development and application of semantic technologies to make research data accessible and interlinked, and to enable the integration and analysis of data. Utilizing the semantics associated with research data in data analysis approaches is often challenging. Now, novel methods are
Keynote Speaker | The 8th BEAR PGR Conference & Users Forum 2018 Robert Hoehndorf, Associate Professor, Computer Science Feb 23, 09:00 - 16:30 The University of Birmingham in the United Kingdom High Performance Computing cloud storage data visualisation Abstract KAUST Assistant Professor Robert Hoehndorf will be a keynote speaker at the 8th BEAR PGR Conference & Users Forum at the University of Birmingham in the United Kingdom. This event focuses on, but is not limited to, computational analysis and numerical modeling. The conference will cater to researchers of all schools interested in BEAR facilities, such as the high-performance computing (HPC) system BlueBEAR, cloud storage, or data visualization.
Keynote Presentation - The 8th BEAR PGR Conference & Users Forum 2018 Feb 23, 09:00 - 16:30 University of Birmingham United Kingdom computational analysis KAUST Assistant Professor Robert Hoehndorf will be a keynote speaker at the 8th BEAR PGR Conference & Users Forum at the University of Birmingham in the United Kingdom.
Fifth KAUST-NVIDIA Workshop on Accelerating Scientific Applications Using GPUs Timothy Lanfear, Brent Leback Feb 18, 08:00 - Feb 20, 17:00 B4 B5 A0215 supercomputing The KAUST Supercomputing Laboratory is co-organizing with NVIDIA, a leader in accelerated computing and artificial intelligence, a full-day workshop on accelerating scientific applications using GPUs on Tuesday, February 20th, 2018 in the auditorium between buildings 4 and 5.
KAUST Research Workshop on Optimization and Big Data Peter Richtarik, Professor, Computer Science Feb 5, 08:00 - Feb 7, 05:00 B19 L3 H2 optimization machine learning Social Network Analysis asynchronous algorithms The age of "big data" is here: data of unprecedented sizes is becoming ubiquitous, which brings new challenges and new opportunities. With this comes the need to solve optimization problems of unprecedented sizes.
Novel Computational Methods to Predict Drug–target Interactions Using Graph Mining and Machine Learning Approaches Rawan Olayan, Ph.D., Computer Science Dec 11, 10:00 - 12:00 B3 L5 R5220 bioinformatics data integration data mining graph mining machine learning Abstract Computational drug repurposing aims at finding new medical uses for existing drugs. The identification of novel drug-target interactions (DTIs) can be a useful part of such a task. Finding DTIs computationally is a convenient strategy to identify potentially new DTIs at low cost with reasonable accuracy. However, current DTI prediction methods suffer from a high false-positive prediction rate. Here, we present a comprehensive review of the recent progress in the field of DTI prediction from data-centric and algorithm-centric perspectives that can help in constructing novel reliable
Big Data Analyses in Evolutionary Biology Dec 4, 08:00 - Dec 6, 17:00 B9 H2 big data Big data analysis evolutionary biology This event is organized by CBRC with financial support from the KAUST Office of Sponsored Research
Contributions to In Silico Genome Annotation Manal Kalkatawi, Ph.D., Computer Science Nov 9, 10:00 - 13:00 B3 L5 R5209 bioinformatics data mining machine learning Deep learning genomics Abstract Genome annotation is an important topic since it provides information for the foundation of downstream genomic and biological research. It is considered a way of summarizing part of the existing knowledge about the genomic characteristics of an organism. Annotating different regions of a genome sequence is known as structural annotation, while identifying the functions of these regions is considered functional annotation. In silico approaches can facilitate both tasks that otherwise would be difficult and time-consuming. This study contributes to genome annotation by introducing
Computational Methods for Large Spatio-temporal Datasets and Functional Data Ranking Jun 13, 09:00 - 10:00 B1 L4 R4102 computational statistics Environmental Statistics In this thesis defense, I will talk about two topics: computational methods for large spatial datasets and functional data ranking. Both tackle the challenges of big and high-dimensional data.
PCCFD - Predictive Complex Computational Fluid Dynamics David Keyes, Senior Associate to the President, King Abdullah University of Science and Technology May 22, 08:45 - May 24, 05:00 B9 L2 H1 CFD algorithms applied mathematics numerical analysis Computer science The PCCFD workshop will focus on cutting-edge research in the field of algorithmic development for CFD and multi-scale complex flow simulations.
Mining Genome-Scale Growth Phenotype Data through Constant-Column Biclustering Majed Alzahrani, Ph.D., Computer Science May 17, 15:00 - 17:00 B3 L5 R5209 data mining machine learning Computational biology Growth phenotype profiling of genome-wide gene-deletion strains over stress conditions can offer a clear picture that the essentiality of genes depends on environmental conditions. In this dissertation, we first demonstrate that detecting such "co-fit" gene groups can be cast as a less well-studied problem in biclustering, i.e., constant-column biclustering. Despite significant advances in biclustering techniques, very few were designed for mining growth phenotype data.
Breaking the Boundaries: from Structure to Algorithms Vadim Lozin, Professor, University of Warwick, UK Apr 17, 14:00 - 15:00 KAUST maximum independent set line graphs boundary classes of graphs Abstract Finding a maximum independent set in a graph is an NP-hard problem. However, restricted to the class of line graphs this problem becomes polynomial-time solvable due to the celebrated matching algorithm of Jack Edmonds. What makes the problem easy in the class of line graphs and what other restrictions can lead to an efficient solution? To answer these questions, we employ the notion of boundary classes of graphs. In this talk, we shed some light on the structure of the boundary separating difficult instances of the problem from polynomially solvable ones and analyze algorithmic tools
Computational Methods for ChIP-seq Data Analysis and Applications Haitham M. Ashoor, Ph.D., Computer Science Apr 10, 16:00 - 17:30 B3 L5 5209 computation techniques machine learning bioinformatics data analysis Abstract The development of Chromatin immunoprecipitation followed by sequencing (ChIP-seq) technology has enabled the construction of genome-wide maps of protein-DNA interaction. Such maps provide information about transcriptional regulation at the epigenetic level (histone modifications and histone variants) and at the level of transcription factor (TF) activity. This dissertation presents novel computational methods for ChIP-seq data analysis and applications. The work of this dissertation addresses four main challenges. First, I address the problem of detecting histone modifications from
Genetic Algorithms for Optimization of Machine-learning Models and their Applications in Bioinformatics Arturo Magana Mora, Ph.D., Computer Science Apr 10, 13:00 - 15:00 B3 L5 R5209 machine learning data mining biology genetics bioinformatics Abstract Machine-learning (ML) techniques have been widely applied to solve different problems in biology. However, biological data are large and complex, which often results in extremely intricate ML models. Frequently, these models may have poor performance or may be computationally unfeasible. This study presents a set of novel computational methods and focuses on the application of genetic algorithms (GAs) for the simplification and optimization of ML models and their applications to biological problems. The dissertation addresses the following three challenges. The first challenge is
Novel Computational Methods that Facilitate Development of Cyanofactories for Free Fatty Acid Production by Olaa Motwalli Olaa A. Motwalli, Ph.D., Computer Science Apr 9, 16:00 - 17:00 B3 L5 R5209 machine learning bioinformatics graph mining genomics Abstract Finding a source from which high-energy-density biofuels can be derived at an industrial scale has become an urgent challenge for renewable energy production. Some microorganisms can produce free fatty acids (FFA) as precursors towards such high-energy-density biofuels. In particular, photosynthetic cyanobacteria are capable of directly converting carbon dioxide into FFA. However, current engineered strains need several rounds of engineering to reach a commercially viable level of FFA production. Thus, new chassis strains that require less engineering are needed
Novel Data Mining Methods for Virtual Screening of Biological Active Chemical Compounds by Othman Soufan Othman Soufan, Ph.D., Computer Science Nov 16, 14:00 - 15:00 H2 B9 machine learning data mining Computational biology biomedical applications Chemical compounds visualization Abstract Drug discovery is a process that takes many years and hundreds of millions of dollars to reveal a confident conclusion about a specific treatment. Part of this sophisticated process is based on preliminary investigations to suggest a set of chemical compounds as candidate drugs for the treatment. Computational resources have been playing a significant role in this part through a step known as virtual screening. From a data mining perspective, the availability of rich data resources is key in training prediction models. Yet, the difficulties imposed by big expansion in data and its
Welcome at Kaust UQ School On Numerical Methods For Direct And Inverse Problems 2016 22-28 May Raul Tempone, Professor, Applied Mathematics and Computational Sciences May 22, 12:00 - May 28, 12:00 B9 H1 R2322 numerical methods stochastic differential equations statistics KAUST UQ SCHOOL 2016 is an annual thematic conference at King Abdullah University of Science and Technology held by Raul Tempone, Professor of Applied Mathematics and Computational Sciences at CEMSE (Computer, Electrical and Mathematical Sciences & Engineering Division). Tempone’s interests in the mathematical foundation of computational science and engineering are reflected in this summer school. The school’s goal is to provide participants with an overview of the most recent research progress in the field of uncertainty quantification, with emphasis on • Multi-Level and Multi-Index
Workshop on Statistical Process Monitoring and Risk Assessment for Engineering and Spatial Environmental Applications Mar 13, 09:00 - Mar 15, 15:00 B1 L4 R4102 Environmental Statistics spatial statistics
LIST OF SPEAKERS
KAUST Environmental Statistics Group, CEMSE Division
- Ying Sun (ying.sun@kaust.edu.sa), PI
- Fouzi Harrou (fouzi.harrou@kaust.edu.sa), Postdoc
- Huang Huang (huang.huang@kaust.edu.sa), PhD Student
- Tianbo Chen (tianbo.chen@kaust.edu.sa), PhD Student
- Rui Meng (rui.meng@kaust.edu.sa), Master Student
- Sulaiman Binkhamis (sulaiman.binkhamis@kaust.edu.sa), Master Student
Hydrology and Land Observation Group, BESE Division
- Gaohong Yin (gaohong.yin@kaust.edu.sa), Master Student
Collaborators
- NorEddine Ghaffour (noreddine.ghaffour@kaust.edu.sa), co-I, WDRC
- Matthew McCabe
A Distributed Implementation of the Multi-resolution Approximation for Very Large Spatial Data Dorit Hammerling, National Center for Atmospheric Research (NCAR) Feb 10, 15:30 - 16:30 B1 L4 R4102 spatial statistics With data of rapidly increasing sizes in the environmental sciences and geosciences, such as satellite observations and high-resolution climate model runs, the spatial statistics community has recently focused on methods that are applicable to very large data. One such state-of-the-art method is the multi-resolution approximation (MRA), which was specifically developed with high-performance computer architecture in mind.
Disease Risk Estimation by Combining Case-Control Data with Aggregated Information on the Population at Risk Xiaohui Chang, Assistant Professor, College of Business at Oregon State University Nov 9, 15:30 - 16:00 B1 L4 R4102 statistics We propose a novel statistical framework by supplementing case–control data with summary statistics on the population at risk for a subset of risk factors. Our approach is to first form two unbiased estimating equations, one based on the case–control data and the other on both the case data and the summary statistics, and then optimally combine them to derive another estimating equation to be used for the estimation.
Kriging Asymptotics William Kleiber, Assistant Professor, University of Colorado Nov 9, 15:00 - 16:30 B1 L4 R4102 spatial statistics Spatial analyses often focus on spatial smoothing using the geostatistical technique known as kriging. Theoretical results regarding large sample convergence rates of kriging predictors remain elusive. By casting kriging as a variational problem, we develop an equivalent kernel approximation technique that can also lead to computational feasibility for large data problems.
Workshop on Computational Space-Time Statistics Oct 4, 09:45 - Oct 6, 10:45 B1 L4 R4102 Statistics of extremes Environmental Statistics Workshop on Computational Space-Time Statistics
Collective Estimation of Multiple Bivariate Density Functions with Application to Angular-sampling-based Protein Structure Prediction Mehdi Moodaaliat, Assistant Professor, Marquette University Mar 10, 15:00 - 16:00 B1 statistics In this talk we develop a method for simultaneous estimation of density functions for a collection of populations of protein backbone angle pairs. Each log density function in the collection is modeled as a linear combination of a common set of basis functions. The shared basis functions are modeled as bivariate splines on triangulations and are estimated using data. The circular nature of angular data is taken into account by imposing appropriate smoothness constraints across boundaries of the triangles.
Bayesian Regression Trees, Nonparametric Heteroscedastic Regression Modeling and MCMC Sampling Matthew Pratola, Assistant Professor of Statistics, The Ohio State University Nov 24, 15:00 - 16:00 B1 L2 nonparametric statistics Bayesian Statistics In this talk, we introduce a new Bayesian regression tree model that allows for possible heteroscedasticity in the variance model and devise novel MCMC samplers that appear to adequately explore the posterior tree space of this model.
Uncertainty Quantification of Tsunami Models Serge Guillas, Professor of Statistics, University College London (UCL) Sep 8, 15:00 - 16:00 B1 uncertainty quantification Environmental Statistics In this talk, we first show various strategies for the efficient emulation of simulators having uncertain inputs, with applications to tsunami wave modelling. A fast surrogate of the simulator's time series of outputs is provided by the outer product emulator.
Parametric Problems, Stochastic, and Identification By Prof. Hermann Matthies (ISCTUB, Germany) Prof. Hermann Matthies, Institute of Scientific Computing TU Braunschweig, Germany Mar 6, 15:00 - 16:00 B1 R4102 Parameter identification problems are formulated in a probabilistic language, where the randomness reflects the uncertainty about the knowledge of the true values. This setting makes it conceptually easy to incorporate new information, e.g. through a measurement, by connecting it to Bayes's theorem. The unknown quantity is modelled as a (possibly high-dimensional) random variable. Such a description has two constituents, the measurable function and the measure.
Scalable Hierarchical Algorithms for eXtreme Computing Workshop David Keyes, Senior Associate to the President, King Abdullah University of Science and Technology Apr 28, 08:00 - Apr 30, 16:00 KAUST scientific computing The 2012 SHAX-C workshop focuses international expert attention on the prospects for the three great hierarchical algorithms of scientific computing: multigrid, fast transforms, and fast multipole methods. These methods are kernels in simulations based on formulations of partial differential equations, integral equations, and interacting particles – in short, they are scientific and engineering workhorses.