Key Info

Basic Information

Portrait: Prof. Dr. Bastian Leibe © Peter Winandy
Prof. Dr. Bastian Leibe
Faculty / Institution:
Mathematics, Computer Science and Natural Sciences
Organizational Unit:
Visual Computing Institute
Excellent Science
Project duration:
01.04.2018 to 31.03.2023
EU contribution:
2.000.000 euros
  EU flag and ERC logo This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant agreement No. 773161)  


Deep Learning for Dynamic 3D Visual Scene Understanding


Over the past 5 years, deep learning has exercised a tremendous and transformational effect on the field of computer vision. However, deep neural networks (DNNs) can only realize their full potential when applied in an end-to-end manner, i.e., when every stage of the processing pipeline is differentiable with respect to the network’s parameters, such that all of those parameters can be optimized together. Such end-to-end learning solutions are still rare for computer vision problems, in particular for dynamic visual scene understanding tasks. Moreover, feed-forward processing, as done in most DNN-based vision approaches, is only a tiny fraction of what the human brain can do. Feedback processes, temporal information processing, and memory mechanisms form an important part of our human scene understanding capabilities. Those mechanisms are currently underexplored in computer vision.
The goal of this proposal is to remove this bottleneck and to design end-to-end deep learning approaches that can realize the full potential of DNNs for dynamic visual scene understanding. We will make use of the positive interactions and feedback processes between multiple vision modalities and combine them to work towards a common goal. In addition, we will impart deep learning approaches with a notion of what it means to move through a 3D world by incorporating temporal continuity constraints, as well as by developing novel deep associative and spatial memory mechanisms.

The results of this research will enable deep neural networks to reach significantly improved dynamic scene understanding capabilities compared to today’s methods. This will have an immediate positive effect for applications in need for such capabilities, most notably for mobile robotics and intelligent vehicles.