Human demonstrations collected by wearable devices (e.g., tactile gloves) provide fast and dexterous supervision for policy learning, guided by rich, natural tactile feedback. However, a key challenge is how to transfer human-collected tactile signals to robots despite differences in sensing modality and embodiment. Existing human-to-robot (H2R) approaches that incorporate touch often assume identical tactile sensors, require paired data, and involve little to no embodiment gap between the human demonstrator and the robot, limiting scalability and generality. We propose TactAlign, a cross-embodiment tactile alignment method that transfers human-collected tactile signals to a robot with a different embodiment. TactAlign transforms human and robot tactile observations into a shared latent representation using a rectified flow, without paired datasets, manual labels, or privileged information. Our method enables low-cost latent transport guided by pseudo-pairs derived from hand-object interactions. We demonstrate that TactAlign improves H2R policy transfer across multiple contact-rich tasks (pivoting, insertion, lid closing), generalizes to unseen objects and tasks from less than 5 minutes of human data, and enables zero-shot H2R transfer on a highly dexterous task (light bulb screwing).
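The abstract's core mechanism — rectified-flow transport between human and robot tactile latents trained on pseudo-pairs — can be illustrated with a minimal sketch. This is not the TactAlign implementation: the latent dimension, the affine toy relation between the two embeddings, the linear velocity model, and all variable names are assumptions made purely for illustration. Standard rectified flow regresses the straight-line velocity between paired samples and then integrates the learned field as an ODE.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in latents: "human" tactile embeddings x0 and "robot"
# embeddings x1, related here by a fixed affine map standing in for
# pseudo-pairs (hypothetical; the real pairing comes from
# hand-object interaction cues, per the abstract).
d = 4
A = rng.normal(size=(d, d)) * 0.3 + np.eye(d)
b = rng.normal(size=d)
x0 = rng.normal(size=(1024, d))   # human-side latents
x1 = x0 @ A.T + b                 # pseudo-paired robot-side latents

def features(x, t):
    # Simple feature map [x_t, t, 1] for a linear velocity model.
    return np.concatenate([x, t, np.ones_like(t)], axis=1)

# Rectified-flow training: regress the straight-line velocity
# (x1 - x0) from the interpolant x_t = (1-t)*x0 + t*x1 and time t.
W = np.zeros((d + 2, d))
lr = 0.05
losses = []
for step in range(500):
    t = rng.uniform(size=(x0.shape[0], 1))
    xt = (1 - t) * x0 + t * x1
    v_target = x1 - x0
    phi = features(xt, t)
    err = phi @ W - v_target
    losses.append(float((err ** 2).mean()))
    W -= lr * phi.T @ err / x0.shape[0]   # gradient step on MSE

def transport(x, steps=50):
    # Euler integration of the learned velocity field: carries a
    # human-side latent across to the robot-side latent space.
    x = x.copy()
    for k in range(steps):
        t = np.full((x.shape[0], 1), k / steps)
        x += features(x, t) @ W / steps
    return x

x_new = rng.normal(size=(8, d))       # unseen human latents
x_robot = transport(x_new)            # transported robot latents
```

A linear velocity model suffices only for this affine toy; the actual method would use a learned network, but the training target (straight-line velocity on pseudo-pairs) and the ODE-based transport are the same shape of computation.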