Tuesday, November 14, 2023, 12:15
- 14:15
Building 1, Level 3, Room 3426
The development of advanced vision-language models requires considerable resources, both computational and in terms of data. There is growing interest in training these models efficiently and effectively, and in leveraging them for various downstream tasks. This dissertation presents several contributions aimed at improving both learning and data efficiency in vision-language learning, and at applying the resulting models to downstream tasks.
Sunday, November 12, 2023, 15:00
- 16:30
B1, L4, R4214
Sequential modeling algorithms have made significant strides in a variety of domains, enabling intelligent decision-making and planning in complex scenarios. This dissertation explores the potential and limitations of these algorithms, introducing novel approaches to enhance their performance across diverse fields, from autonomous driving and trajectory forecasting to reinforcement learning and vision-language understanding.
The 2nd SAAI Factory Hackathon Kickoff Symposium 2023
Tuesday, May 02, 2023, 09:00
- 17:00
Building 20, Auditorium

We are pleased to invite you to the second SAAI (Super Artistic AI) Factory Hackathon 2023, a program chaire

Wednesday, February 15, 2023, 20:10
- 22:00
B1, L2, R2202
In computer vision, generative AI models are typically built for images, videos, and 3D objects. Recently, the paradigm of neural fields has emerged, which unifies the representations of these data types by parametrizing them via neural networks. In this thesis, we develop generative models for images, videos, and 3D scenes that treat the underlying data in this form, and explore the benefits that such a perspective provides.
Monday, May 16, 2022, 12:00
- 13:00
Building 9, Room 2322, Hall 1
Datasets that capture the connection between vision, language, and affect are scarce, limiting our understanding of the emotional aspect of human intelligence. As a step in this direction, the ArtEmis dataset was recently introduced as a large-scale collection of emotional reactions to images, along with language explanations of the chosen emotions.
Monday, March 08, 2021, 12:00
- 13:00
KAUST
We present a novel large-scale dataset and accompanying machine learning models aimed at providing a detailed understanding of the interplay between visual content, its emotional effect, and explanations for the latter in language. In contrast to most existing annotation datasets in computer vision, we focus on the affective experience triggered by visual artworks and ask annotators to indicate the dominant emotion they feel for a given image and, crucially, to also provide a grounded verbal explanation for their emotion choice.