Humans rely on their visual and tactile senses to develop a comprehensive 3D understanding of their physical environment. Recently, there has been a growing interest in exploring and manipulating objects using data-driven approaches that utilise high-resolution vision-based tactile sensors. However, 3D shape reconstruction using tactile sensing has lagged behind visual shape reconstruction because of limitations in existing techniques, including the inability to generalise over unseen shapes, the absence of real-world testing, and limited expressive capacity imposed by discrete representations. To address these challenges, we propose TouchSDF, a Deep Learning approach for tactile 3D shape reconstruction that leverages the rich information provided by a vision-based tactile sensor and the expressivity of the implicit neural representation DeepSDF. Our technique consists of two components: (1) a Convolutional Neural Network that maps tactile images into local meshes representing the surface at the touch location, and (2) an implicit neural function that predicts a signed distance function to extract the desired 3D shape. This combination allows TouchSDF to reconstruct smooth and continuous 3D shapes from tactile inputs in simulation and real-world settings, opening up research avenues for robust 3D-aware representations and improved multimodal perception in robotics. Code and supplementary material are available at: https://touchsdf.github.io/
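The idea behind component (2), recovering a shape from a signed distance function constrained by touched surface points, can be illustrated with a deliberately simplified sketch. It is not the TouchSDF implementation. DeepSDF uses a learned neural decoder conditioned on a latent shape code, whereas this sketch assumes an analytic sphere SDF whose radius stands in for that code. The names `sphere_sdf`, `fit_radius`, and `sample_sphere_surface` are hypothetical, and the contact points are synthetic rather than predicted from tactile images.

```python
import math
import random


def sphere_sdf(point, radius):
    """Signed distance to an origin-centred sphere.

    Negative inside, zero on the surface, positive outside.
    Assumption: an analytic stand-in for a learned SDF decoder.
    """
    x, y, z = point
    return math.sqrt(x * x + y * y + z * z) - radius


def fit_radius(contact_points, init_radius=0.1, lr=0.1, steps=200):
    """Fit the radius so that the SDF is close to zero at every contact point.

    The radius plays the role of a shape code: it is optimised by gradient
    descent on the mean squared SDF value at the touched points.
    """
    radius = init_radius
    n = len(contact_points)
    for _ in range(steps):
        # d/d(radius) of mean(sdf^2) is mean(2 * sdf * (-1)).
        grad = sum(-2.0 * sphere_sdf(p, radius) for p in contact_points) / n
        radius -= lr * grad
    return radius


def sample_sphere_surface(radius, count, seed=0):
    """Synthetic 'touches': points drawn uniformly on a sphere surface."""
    rng = random.Random(seed)
    points = []
    for _ in range(count):
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        points.append(tuple(radius * c / norm for c in v))
    return points


# Simulate 50 touches on a sphere of radius 0.7, then recover the radius.
contacts = sample_sphere_surface(radius=0.7, count=50)
estimated = fit_radius(contacts)
```

In this toy setting the surface is the zero level set of `sphere_sdf`, so driving the SDF towards zero at the contact points recovers the shape parameter. A learned decoder would replace the analytic function, and a higher-dimensional latent code would replace the single radius.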