In cluttered environments where visual sensors encounter heavy occlusion, such as in agricultural settings, tactile signals can provide crucial spatial information for the robot to locate rigid objects and maneuver around them. We introduce SonicBoom, a holistic hardware and learning pipeline that enables contact localization through an array of contact microphones. While conventional sound-source localization methods effectively triangulate sources in air, localization through solid media with irregular geometry and structure presents challenges that are difficult to model analytically. We address this challenge through a feature-engineering and learning-based approach, autonomously collecting 18,000 robot interaction-sound pairs to learn a mapping between acoustic signals and collision locations on the robot end-effector link. By leveraging relative features between microphones, SonicBoom achieves localization errors of 0.42 cm for in-distribution interactions and maintains robust performance of 2.22 cm error even with novel objects and contact conditions. We demonstrate the system's practical utility through haptic mapping of occluded branches in mock canopy settings, showing that acoustic-based sensing can enable reliable robot navigation in visually challenging environments.
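To illustrate the idea of pairing relative inter-microphone features with a learned regressor, the following is a minimal toy sketch, not the paper's pipeline: it simulates taps propagating to a small 1-D microphone array, extracts pairwise GCC-PHAT peak lags as relative features, and fits a ridge regressor from those features to the contact position. All positions, wave speeds, and the choice of GCC-PHAT are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
MICS = np.array([0.0, 0.1, 0.2, 0.3])  # hypothetical mic positions (m) along the link
C = 50.0                                # toy propagation scale (samples per meter)

def synth_tap(x):
    """Toy tap at position x: an impulse reaching each mic with a distance-dependent delay."""
    sig = np.zeros((len(MICS), 256))
    for i, m in enumerate(MICS):
        d = int(abs(x - m) * C) + 5     # arrival sample for mic i
        sig[i, d] = 1.0
    sig += 0.01 * rng.standard_normal(sig.shape)  # sensor noise
    return sig

def relative_features(sig):
    """Pairwise GCC-PHAT peak lags between microphones (relative features)."""
    feats = []
    for i in range(len(sig)):
        for j in range(i + 1, len(sig)):
            A, B = np.fft.rfft(sig[i]), np.fft.rfft(sig[j])
            R = A * np.conj(B)
            cc = np.fft.irfft(R / (np.abs(R) + 1e-12))  # phase-transform whitening
            lag = int(np.argmax(np.abs(cc)))
            if lag > len(cc) // 2:      # unwrap circular lag to a signed value
                lag -= len(cc)
            feats.append(lag)
    return np.array(feats, dtype=float)

# Collect (features, location) pairs and fit a ridge regressor in closed form.
X = np.linspace(0.0, 0.3, 61)
F = np.stack([relative_features(synth_tap(x)) for x in X])
F1 = np.hstack([F, np.ones((len(F), 1))])  # append a bias column
w = np.linalg.solve(F1.T @ F1 + 1e-6 * np.eye(F1.shape[1]), F1.T @ X)

pred = F1 @ w
print(f"mean abs localization error: {np.mean(np.abs(pred - X)):.4f} m")
```

In this toy setting the pairwise lags encode only arrival-time differences, so the learned map is invariant to the tap's absolute onset time, which is one motivation for using relative rather than absolute features.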