Hengshuang Zhao, Postdoctoral Fellow, Oxford University
Sunday, March 21, 2021, 12:00 - 13:00
KAUST
Building intelligent visual systems is essential for the next generation of artificial intelligence. Such systems are a fundamental tool for many disciplines and benefit applications such as autonomous driving, robotics, surveillance, and augmented reality, to name a few. An accurate and efficient intelligent visual system has a deep understanding of scenes, objects, and humans, and can interpret its surroundings automatically. In general, 2D images and 3D point clouds are the two most common data representations in our daily life. Powerful image understanding and point cloud processing systems are two pillars of visual intelligence, enabling artificial intelligence systems to understand and interact with the current state of the environment automatically. In this talk, I will first present our efforts in designing modern neural systems for 2D image understanding, including high-accuracy and high-efficiency semantic parsing structures and a unified panoptic parsing architecture. Then, I will go one step further to neural systems for processing complex 3D scenes, including semantic-level and instance-level understanding. Next, I will show our latest work on unified 2D-3D reasoning frameworks, which are fully based on self-attention mechanisms. Finally, I will discuss the challenges, up-to-date progress, and promising future directions for building advanced intelligent visual systems.