Tactile recognition of 3D objects remains a challenging task. Compared to 2D shapes, the complex geometry of 3D surfaces requires richer tactile signals, more dexterous actions, and more advanced encoding techniques. In this work, we propose TANDEM3D, a method that applies a co-training framework for exploration and decision making to 3D object recognition with tactile signals. Building on our previous work, which introduced a co-training paradigm for 2D recognition problems, we introduce a number of advances that enable us to scale up to 3D. TANDEM3D is based on a novel encoder that uses PointNet++ to build a 3D object representation from contact positions and normals. Furthermore, by enabling 6DOF movement, TANDEM3D explores efficiently and collects discriminative touch information. Our method is trained entirely in simulation and validated with real-world experiments. Compared to state-of-the-art baselines, TANDEM3D recognizes 3D objects with higher accuracy and fewer actions, and is also shown to be more robust to different types and amounts of sensor noise. A video is available at https://jxu.ai/tandem3d.