Video-based Reconstruction of the Real World in Motion

Christian Theobalt

(Max-Planck-Institute (MPI) for Informatics)

Date: March 21, 2018


New methods for capturing highly detailed models of moving real-world scenes with cameras, i.e., models of detailed deforming geometry, appearance, or even material properties, are becoming increasingly important in many application areas. They are needed in visual content creation, for instance in visual effects, to build highly realistic models of virtual human actors. Furthermore, efficient, reliable, and highly accurate dynamic scene reconstruction is now an important prerequisite for many other application domains, such as human-computer and human-robot interaction, autonomous robotics and autonomous driving, virtual and augmented reality, 3D and free-viewpoint TV, immersive telepresence, and even video editing.

The development of dynamic scene reconstruction methods has been a long-standing challenge in computer graphics and computer vision. Recently, the field has seen important progress. New methods were developed that capture, without markers or scene instrumentation, rather detailed models of individual moving humans or general deforming surfaces from video recordings, and even capture simple models of appearance and lighting. However, despite this progress, the field is still at an early stage, and current technology remains starkly constrained in many ways. Many of today’s state-of-the-art methods are niche solutions designed to work only under very constrained conditions, for instance only in controlled studios, with many cameras, for very specific object types, for very simple types of motion and deformation, or at processing speeds far from real-time.

In this talk, I will present some of our recent work on detailed marker-less dynamic scene reconstruction and performance capture, in which we advanced the state of the art in several ways. For instance, I will briefly show new methods for marker-less capture of the full body (like our VNECT approach) and hands that work in more general environments, and even in real time with a single camera. I will then show some of our work on high-quality face performance capture and face reenactment. Here, I will also illustrate the benefits of both model-based and learning-based approaches and show how different ways of combining the two open up new possibilities. Live demos included!

Created: Thursday, March 22nd, 2018