Computer Science Graduate Seminar
Monday, August 16, 2021, 2:00pm
3D Scene Understanding on Point Clouds
- Francis Engelmann, M.Sc. – Chair for Computer Science 13
- Zoom: https://rwth.zoom.us/j/97522924662?pwd=NWN1Z2lkWGJOTkt0N2QyTHBRdSs1Zz09
Meeting ID: 975 2292 4662
In this talk, I present my thesis contributions to the emerging field of 3D scene understanding. Given a 3D scene representation as input, we address tasks such as 3D object detection, shape reconstruction and pose estimation, as well as 3D semantic and instance segmentation. The recent availability of inexpensive depth sensors has made 3D data widely accessible. At the same time, current aspirations in robotics, augmented reality and self-driving cars require efficient and reliable algorithms for understanding different 3D scene representations, such as polygon meshes, point clouds or volumetric structures. While 3D data overcomes inherent limitations of projected 2D views, such as occlusions, scale ambiguity and lack of geometry, it also introduces new challenges, including sparsity and non-uniform sampling. As a result, existing methods for 2D image processing do not generalize well to 3D data structures. In this talk, we present novel approaches specific to 3D scene understanding.
The main contributions are organized into three parts:
The core contribution of the first part is a probabilistic formulation which integrates 3D shape and motion priors as well as stereo depth measurements into a global optimization problem. The resulting approach can jointly estimate the 3D shape, pose and motion of multiple vehicles in urban street scenes.
The second part deals with new deep learning models for processing 3D point clouds. In particular, we propose sequential and recurrent consolidation units for increasing the spatial context of point networks, and a simple yet efficient dilation mechanism for increasing the receptive field size of deep point convolutional networks.
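The abstract does not spell out how the dilation mechanism works. As a purely illustrative sketch, and not necessarily the formulation used in the thesis, one simple way to dilate a k-nearest-neighbor neighborhood is to retrieve the k * d nearest points and keep every d-th one, which widens the spatial extent covered by the same number of neighbors. The function name `dilated_knn` and its parameters are hypothetical and chosen only for this example.

```python
import math


def dilated_knn(points, query_idx, k, dilation):
    """Return indices of k neighbors of points[query_idx].

    Illustrative assumption: the neighborhood is dilated by taking the
    k * dilation nearest points and keeping every `dilation`-th one.
    """
    q = points[query_idx]
    # Sort all other points by Euclidean distance to the query point.
    order = sorted(
        (i for i in range(len(points)) if i != query_idx),
        key=lambda i: math.dist(points[i], q),
    )
    # Keep the k * dilation nearest candidates, then subsample with a stride.
    return order[: k * dilation][::dilation]


# Toy example: ten points on a line, query point at the origin.
points = [(float(x), 0.0, 0.0) for x in range(10)]
print(dilated_knn(points, 0, k=3, dilation=1))  # [1, 2, 3]
print(dilated_knn(points, 0, k=3, dilation=2))  # [1, 3, 5]
```

With the same k, a dilation of 2 reaches points up to distance 5 instead of 3 in this toy example, so the receptive field grows without increasing the number of neighbors that are processed.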
Finally, the third part introduces more advanced deep learning models for segmentation. For semantic segmentation, we combine two types of convolutions that operate jointly on point clouds and mesh surfaces. For instance segmentation, we propose a new paradigm that combines the bottom-up and top-down approaches introduced in previous works.
The talk concludes with a discussion and directions for future research.
The computer science lecturers invite everyone interested to attend.