Abstract:
The 3D data collected using state-of-the-art algorithms often suffers from various problems, such as incompleteness and inaccuracy. Using temporal information has proven effective for improving reconstruction quality; for example, KinectFusion shows significant improvements for static scenes. In this work, we present a system that uses commodity depth and color cameras, such as Microsoft Kinects, to fuse the 3D data captured over time for dynamic objects into a complete and accurate model, and then tracks the model to match later observations. The key ingredients of our system are a nonrigid matching algorithm that aligns 3D observations of dynamic objects using both geometry and texture measurements, and a volumetric fusion algorithm that fuses noisy 3D data. We demonstrate that the quality of the model improves dramatically when a sequence of noisy and incomplete depth data of a human is fused, and that deforming this fused model to later observations yields 3D models free of noise and holes for a freely moving human.
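To illustrate the idea of volumetric fusion of noisy depth data mentioned above, the following is a minimal sketch of a truncated signed distance (TSDF) running average along a single camera ray, in the style popularized by KinectFusion. It is not the authors' implementation: the function names (`fuse_tsdf`, `fuse_sequence`, `zero_crossing`), the truncation distance, and the unit per-measurement weight are all assumptions made for illustration, and the nonrigid alignment step is assumed to have already been applied to each measurement.

```python
def fuse_tsdf(tsdf, weight, voxel_depths, measured_depth,
              trunc=0.05, max_weight=64.0):
    """Fuse one depth measurement into the voxels along a ray (in place).

    All names and default values here are hypothetical, for illustration.
    """
    for i, z in enumerate(voxel_depths):
        sdf = measured_depth - z      # positive in front of the surface
        if sdf < -trunc:              # far behind the surface: unobserved
            continue
        d = min(1.0, sdf / trunc)     # truncate and normalise to [-1, 1]
        w_new = 1.0                   # assumed constant per-measurement weight
        # Weighted running average of the truncated signed distance.
        tsdf[i] = (tsdf[i] * weight[i] + d * w_new) / (weight[i] + w_new)
        weight[i] = min(weight[i] + w_new, max_weight)
    return tsdf, weight


def fuse_sequence(voxel_depths, measurements, trunc=0.05):
    """Fuse a sequence of (already aligned) depth measurements."""
    tsdf = [0.0] * len(voxel_depths)
    weight = [0.0] * len(voxel_depths)
    for m in measurements:
        fuse_tsdf(tsdf, weight, voxel_depths, m, trunc)
    return tsdf, weight


def zero_crossing(tsdf, weight, voxel_depths):
    """Estimate the surface depth by linearly interpolating the first
    positive-to-nonpositive sign change between observed voxels."""
    for i in range(len(tsdf) - 1):
        if weight[i] > 0 and weight[i + 1] > 0 and tsdf[i] > 0 >= tsdf[i + 1]:
            t = tsdf[i] / (tsdf[i] - tsdf[i + 1])
            return voxel_depths[i] + t * (voxel_depths[i + 1] - voxel_depths[i])
    return None


if __name__ == "__main__":
    depths = [0.90 + 0.02 * i for i in range(11)]   # voxel centres in metres
    noisy = [0.98, 1.02, 1.01, 0.99]                # noisy depth readings
    tsdf, weight = fuse_sequence(depths, noisy)
    print(zero_crossing(tsdf, weight, depths))
```

In this sketch each voxel stores an averaged signed distance rather than a single reading, so independent noise in the individual measurements tends to cancel, and voxels missed in one frame can still be filled by another. A full system would apply the same update over a 3D grid after aligning each frame to the model.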