Geometry and Semantics in Computer Vision


Roland Angst

(Stanford University)

Date: 5/28/2013


Recent developments in computer vision have led to well-established pipelines for fully automated image- or video-based scene reconstruction. While 2D scene understanding has also progressed, 3D reconstruction and scene understanding have evolved independently of each other. Hence, a major current trend in computer vision is the development of more holistic approaches that combine scene understanding and 3D reconstruction in a joint, more robust and accurate framework (3D scene understanding).
In my talk, I will present two recent but entirely different approaches that build upon such geometric concepts in order to extract scene semantics. Using a structure-from-motion pipeline and a low-rank factorization technique, the first approach analyzes the motion of rigid parts in order to extract motion constraints between those parts. Such motion constraints reveal valuable information about the functionality of an object, e.g., a rolling motion parallel to a plane is likely due to a wheel. The second approach combines appearance-based superpixel classifiers with class-specific 3D shape priors for joint 3D scene reconstruction and class segmentation. The appearance-based classifiers guide the selection of an appropriate 3D shape regularization term, whereas the 3D reconstruction in turn helps with the class segmentation task. We will see how this can be formulated as a convex optimization problem over the labels of a volumetric voxel grid.
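
As background for the low-rank factorization mentioned above, the following is a minimal sketch of the classic rank constraint for a single rigid part under orthographic projection (in the spirit of Tomasi-Kanade factorization). It is not the method presented in the talk. The synthetic data, the orthographic camera model, and all function names (`random_rotation`, `trajectory_matrix`, `numerical_rank`) are assumptions made purely for illustration.

```python
import numpy as np


def random_rotation(rng):
    """Draw a random 3x3 rotation matrix (illustrative helper)."""
    # QR decomposition of a random matrix yields an orthonormal basis.
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q = q * np.sign(np.diag(r))  # fix column signs
    if np.linalg.det(q) < 0:     # ensure a proper rotation (det = +1)
        q[:, 0] = -q[:, 0]
    return q


def trajectory_matrix(points, n_frames, rng):
    """Stack orthographic projections of one rigid point set over frames.

    points: (3, P) array of 3D points on a single rigid part.
    Returns a (2 * n_frames, P) matrix of 2D feature trajectories.
    """
    rows = []
    for _ in range(n_frames):
        R = random_rotation(rng)
        t = rng.standard_normal(2)
        # Assumed camera model: orthographic projection keeps the first
        # two rows of the rotation and adds a 2D translation.
        rows.append(R[:2] @ points + t[:, None])
    return np.vstack(rows)


def numerical_rank(W, tol=1e-8):
    """Rank of the trajectory matrix after removing per-row centroids."""
    Wc = W - W.mean(axis=1, keepdims=True)  # cancels the translations
    s = np.linalg.svd(Wc, compute_uv=False)
    return int(np.sum(s > tol * s[0]))


rng = np.random.default_rng(0)
points = rng.standard_normal((3, 20))       # synthetic rigid part, 20 points
W = trajectory_matrix(points, n_frames=10, rng=rng)
rigid_rank = numerical_rank(W)              # at most 3 for one rigid part
```

In this noise-free setting, the centered trajectory matrix of one rigid part has rank at most 3, however many frames and points are stacked. Rank constraints of this kind are what make factorization a natural tool for reasoning about how rigid parts move.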

Created: Tuesday, May 28th, 2013