This work presents an instance-agnostic learning framework that fuses vision with dynamics to simultaneously learn shape, pose trajectories, and physical properties, using geometry as a shared representation. Unlike many contact learning approaches that assume motion capture input and a known shape prior for the collision model, our framework learns an object's geometric and dynamic properties from RGBD video, without requiring either category-level or instance-level shape priors. We integrate a vision system, BundleSDF, with a dynamics system, ContactNets, and propose a cyclic training pipeline that uses the output of the dynamics module to refine the poses and geometry from the vision module via perspective reprojection. Experiments demonstrate our framework's ability to learn the geometry and dynamics of rigid and convex objects and to improve upon the current tracking framework.
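To make the perspective reprojection step concrete, the following is a minimal sketch of projecting object-frame points into an image under a standard pinhole camera model. It is not the authors' implementation and does not use the BundleSDF or ContactNets code; the function name `reproject_points`, the intrinsics `K`, and the pose `(R, t)` are illustrative assumptions, and lens distortion is ignored.

```python
import numpy as np

def reproject_points(points_obj, R, t, K):
    """Project object-frame 3D points into pixel coordinates (pinhole model).

    Hypothetical helper for illustration only.
    points_obj: (N, 3) points in the object frame
    R, t:       rotation (3, 3) and translation (3,) taking object frame to camera frame
    K:          (3, 3) camera intrinsics matrix
    """
    points_cam = points_obj @ R.T + t      # rigid transform into the camera frame
    depth = points_cam[:, 2]               # depth along the optical axis
    uvw = points_cam @ K.T                 # apply intrinsics (homogeneous pixels)
    pixels = uvw[:, :2] / uvw[:, 2:3]      # perspective divide
    return pixels, depth

# Assumed example values: 500 px focal length, principal point at (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                              # object axes aligned with the camera
t = np.array([0.0, 0.0, 1.0])              # object origin 1 m in front of the camera
points = np.array([[0.0, 0.0, 0.0],
                   [0.1, -0.1, 0.0]])

pixels, depth = reproject_points(points, R, t, K)
```

In a pipeline of the kind described, a pose estimate produced by a dynamics module could be passed through such a projection and compared against the observed RGBD frame; how that comparison is scored is left unspecified here.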