In this paper, we present Tac2Pose, an object-specific approach to tactile pose estimation from the first touch for known objects. Given the object geometry, we learn a tailored perception model in simulation that estimates a probability distribution over possible object poses given a tactile observation. To do so, we simulate the contact shapes that a dense set of object poses would produce on the sensor. Then, given a new contact shape obtained from the sensor, we match it against the pre-computed set using an object-specific embedding trained with contrastive learning. We obtain contact shapes from the sensor with an object-agnostic calibration step that maps RGB tactile observations to binary contact shapes. This mapping, which can be reused across object and sensor instances, is the only step trained with real sensor data. The result is a perception model that localizes objects from the first real tactile observation. Importantly, it produces pose distributions and can incorporate additional pose constraints coming from other perception systems, contacts, or priors. We provide quantitative results for 20 objects. Tac2Pose provides high-accuracy pose estimates from distinctive tactile observations, while regressing meaningful pose distributions to account for contact shapes that could result from several different object poses. We also test Tac2Pose on object models reconstructed with a 3D scanner to evaluate its robustness to uncertainty in the object model. Finally, we demonstrate the advantages of Tac2Pose over three baseline methods for tactile pose estimation: directly regressing the object pose with a neural network, matching an observed contact to a set of possible contacts using a standard classification neural network, and direct pixel comparison of an observed contact with a set of possible contacts. Website: http://mcube.mit.edu/research/tac2pose.html
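The matching step described in the abstract can be illustrated with a minimal sketch. It assumes that embeddings for the simulated contact shapes have already been computed. It also assumes that the pose distribution is a softmax over cosine similarities between the query embedding and the pre-computed embeddings, and that extra pose constraints enter as a multiplicative prior. The function name `pose_distribution`, the `temperature` parameter, and the random stand-in embeddings are hypothetical choices made for illustration, not details confirmed by the abstract.

```python
import numpy as np

def pose_distribution(query_emb, pose_embs, temperature=0.1, prior=None):
    """Turn embedding similarities into a distribution over pre-computed poses.

    query_emb: (d,) embedding of the observed contact shape.
    pose_embs: (n, d) embeddings of the simulated contact shapes, one per pose.
    prior:     optional (n,) non-negative weights encoding extra pose constraints.
    All names and the softmax form are illustrative assumptions.
    """
    # Cosine similarity between the query and every pre-computed embedding.
    q = query_emb / np.linalg.norm(query_emb)
    p = pose_embs / np.linalg.norm(pose_embs, axis=1, keepdims=True)
    logits = (p @ q) / temperature

    # Numerically stable softmax numerator.
    weights = np.exp(logits - logits.max())

    # Fold in constraints from other sources (assumed multiplicative here).
    if prior is not None:
        weights = weights * prior

    total = weights.sum()
    if total == 0:
        raise ValueError("prior excludes every candidate pose")
    return weights / total

# Stand-in data: 5 hypothetical poses with random 8-D embeddings.
rng = np.random.default_rng(0)
pose_embs = rng.normal(size=(5, 8))
query_emb = pose_embs[2] + 0.01 * rng.normal(size=8)  # observation close to pose 2

probs = pose_distribution(query_emb, pose_embs)

# A hypothetical external constraint that rules out pose 2.
prior = np.array([1.0, 1.0, 0.0, 1.0, 1.0])
constrained = pose_distribution(query_emb, pose_embs, prior=prior)
```

Returning a full distribution rather than a single best match keeps ambiguous contacts visible: when several poses produce similar contact shapes, their probabilities stay comparable, and a prior from another perception system can then disambiguate them.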