Robots cannot yet match humans' ability to rapidly learn the shapes of novel 3D objects and recognize them robustly despite clutter and occlusion. We present Bayes3D, an uncertainty-aware perception system for structured 3D scenes that reports accurate posterior uncertainty over 3D object shape, pose, and scene composition in the presence of clutter and occlusion. Bayes3D delivers these capabilities via a novel hierarchical Bayesian model for 3D scenes and a GPU-accelerated coarse-to-fine sequential Monte Carlo algorithm. Quantitative experiments show that Bayes3D can learn 3D models of novel objects from just a handful of views, recognize them more robustly and with orders of magnitude less training data than neural baselines, and track 3D objects faster than real time on a single GPU. We also demonstrate that Bayes3D learns complex 3D object models and accurately infers 3D scene composition when used on a Panda robot in a tabletop scenario.