This research theme aims to improve visual computing techniques that understand the visual world, enabling more accurate, robust, and efficient perception and control in autonomous vehicles, whether in the air (unmanned aerial vehicles or UAVs), on the ground (cars), or underwater (autonomous underwater vehicles or AUVs). The grand challenge of this theme is to develop fully autonomous vehicles that can navigate safely and successfully in a variety of scenarios and environments. Our work spans both fundamental research on new methodology and exciting applications of autonomous vehicles, targeted in collaboration with domain experts.
The Visual Computing for Autonomous Vehicles theme encompasses the following problems:
Autonomous Vehicle Perception: This problem is a core challenge in enabling autonomous vehicle navigation. It includes the design of accurate, robust, and efficient methods that analyze the 2D and 3D data acquired by the vehicle's sensors to make semantic sense of its surroundings. This problem includes the development of methods for simultaneous localization and mapping (SLAM), depth estimation, and 2D or 3D object detection/segmentation.
Understanding Dynamic Environments and Agents: This problem involves analyzing sensory and perception data to form a better understanding of the dynamic environment in which the vehicle operates. It requires the development of methods that can model and predict the movement of agents and environment elements around the vehicle, for use in safe navigation.
Fully Autonomous Vehicle Control: This problem includes the development of control methods that make use of the extracted perception data and the dynamic environment modeling to robustly control the dynamics of the vehicle to achieve its planned goal (e.g. safely reach a destination while following driving rules).
Collaborative Vehicles: The challenge of fully autonomous vehicles can greatly benefit from collaboration between vehicles, something that is not possible with human drivers. This problem includes the development of collaborative methods among vehicles so that the perception, control, and planning of a single vehicle can leverage information shared by other vehicles.
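As a toy illustration of the motion-prediction piece of the second problem above, a constant-velocity extrapolator is about the simplest possible agent-motion model (real systems use learned, interaction-aware predictors; all names here are illustrative):

```python
def predict_constant_velocity(positions, dt, horizon_steps):
    """Extrapolate future 2D positions assuming constant velocity.

    positions: list of (x, y) tuples observed at a fixed interval dt.
    Returns horizon_steps predicted (x, y) positions.
    """
    (x0, y0), (x1, y1) = positions[-2], positions[-1]
    # Finite-difference estimate of the agent's velocity.
    vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
    return [(x1 + vx * dt * k, y1 + vy * dt * k)
            for k in range(1, horizon_steps + 1)]

# A pedestrian moving at 1 m/s along x, sampled every 0.5 s:
track = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)]
future = predict_constant_velocity(track, dt=0.5, horizon_steps=3)
# future -> [(1.5, 0.0), (2.0, 0.0), (2.5, 0.0)]
```

A planner would then check such predicted trajectories against the vehicle's own planned path for potential conflicts.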
Sim2Real: Learning to Drive
Data-driven learning of vehicle driving policies requires a large amount of training data, especially for edge cases. Acquiring and annotating such training data is nearly impossible in the real world; in many cases, it is simply too dangerous (e.g. scenarios where a moving car is in immediate proximity to pedestrians and other cars).
Photo-realistic simulators offer a rich and practically infinite source of training data, especially for these edge cases. For example, pedestrians can be hit in simulation to train the system to avoid such behavior. Likewise, the weather, as well as the static/dynamic layout of the driving environment, can be easily changed in simulation. However, the visual gap between such simulators and the real world needs to be addressed. VCC professor Bernard Ghanem has worked extensively on learning how to transfer what is learned in simulation to the real world to overcome this gap (called Sim2Real transfer). This can be done in several ways, notably by training the policy on intermediate representations that are shared between the simulated and real worlds (e.g. semantic segmentation instead of raw RGB images).
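The intermediate-representation idea can be sketched in miniature: simulated and real pixels look different, but once both are mapped into the same label space, a policy defined on labels never sees the visual gap. The color tables and the trivial "policy" below are purely hypothetical stand-ins for a learned segmentation network and driving policy:

```python
# Hypothetical color-to-label tables for a simulator and a real camera;
# both map onto the SAME label space (the shared representation).
SIM_COLORS = {(50, 50, 200): "road", (200, 50, 50): "pedestrian"}
REAL_COLORS = {(90, 90, 95): "road", (180, 120, 100): "pedestrian"}

def segment(image, color_table):
    """Toy segmenter: map each pixel's RGB value to a semantic label."""
    return [[color_table[px] for px in row] for row in image]

def policy(labels):
    """Toy driving policy on the shared representation: brake if any
    pedestrian label is present, otherwise keep going."""
    return "brake" if any("pedestrian" in row for row in labels) else "go"

sim_frame = [[(50, 50, 200), (200, 50, 50)]]   # rendered pixels
real_frame = [[(90, 90, 95), (180, 120, 100)]]  # camera pixels

# Different raw pixels, identical intermediate representation,
# hence identical policy behavior in sim and reality.
assert segment(sim_frame, SIM_COLORS) == segment(real_frame, REAL_COLORS)
```

The point of the sketch: the policy's input distribution is unchanged between domains, so a policy trained entirely in simulation can transfer.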
Here is an example of transferring a car driving policy trained in simulation (using Intel's CARLA driving simulator) to the real world, deployed on a remote-control car. This work was done in collaboration with Intel Labs in Munich, Germany.
Sim2Real: Learning to Aerially Track
This is another example of transferring perception and control knowledge learned in simulation to a real-world application. In this case, the application of interest is onboard tracking of objects from a UAV. The UAV tracking system must find where the object of interest is at the current time and predict where it will be in the near future. In contrast to classical object tracking in computer vision, this tracking information is used, in turn, to control UAV navigation such that the object remains within the UAV's field of view throughout its flight. Moreover, when the UAV battery runs low, it must initiate and execute a handover of this tracking information to another UAV, so as to maintain persistent tracking of the object. In this project, VCC professor Bernard Ghanem and his group developed their own UAV simulator using a popular computer game engine to train a vision-based object tracker in simulation before transferring it to real-world aerial tracking.
Sim2Real: Learning to Race UAVs
In this project, the interesting application of UAV racing is considered. Here, the UAV flying policy is learned in simulation using a new type of hybrid learning called OIL (Observational Imitation Learning). OIL is a hybrid of reinforcement and imitation learning that enables the policy to learn from multiple less-than-perfect teachers. This scenario more faithfully describes real-world situations, where the teachers supervising the learning of a task may themselves have different and/or suboptimal strategies. Using OIL, the flying policy learns to reliably produce control/navigation signals for the UAV from raw video acquired by an onboard camera, and it outperforms all of its teachers. In this case, each teacher is a simple racing policy with a simple control model.
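The source does not spell out OIL's internals, but the core intuition of "surpassing multiple imperfect teachers" can be sketched as follows: score each teacher's proposed action per state with a value estimate and imitate only the best one, so the learner inherits each teacher's strengths and none of them alone bounds its performance. Everything here (teacher policies, the value function, names) is an illustrative assumption, not the published algorithm:

```python
def best_teacher_action(state, teachers, value_fn):
    """Per state, pick the teacher action with the highest estimated value.

    teachers: list of policies mapping state -> action.
    value_fn: critic-style estimate of how good (state, action) is.
    A learner that imitates only these filtered actions can exceed
    any single imperfect teacher.
    """
    actions = [t(state) for t in teachers]
    return max(actions, key=lambda a: value_fn(state, a))

# Two imperfect "teachers" steering a 1D position toward a target of 0:
target = 0.0
overshoot = lambda s: -2.5 * s    # corrects far too aggressively
undershoot = lambda s: -0.2 * s   # corrects too timidly

# Toy value: the closer the next position is to the target, the better.
value = lambda s, a: -abs((s + a) - target)

state = 1.0
chosen = best_teacher_action(state, [overshoot, undershoot], value)
# From this state the timid teacher's action (-0.2) lands closer to the
# target than the aggressive one's (-2.5), so it is the one imitated.
```

Aggregated over many states, this per-state filtering is what lets the learned policy outperform every individual teacher.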