Aiming to replicate human-like dexterity, perceptual experiences, and motion patterns, we explore learning from human demonstrations using a bimanual system with multifingered hands and visuotactile data. Two significant challenges exist: the lack of an affordable and accessible teleoperation system suitable for a dual-arm setup with multifingered hands, and the scarcity of multifingered hand hardware equipped with touch sensing. To tackle the first challenge, we develop HATO, a low-cost hands-arms teleoperation system built from off-the-shelf electronics and paired with a comprehensive software suite that supports efficient data collection, multimodal data processing, scalable policy learning, and smooth policy deployment. To tackle the second challenge, we introduce a novel hardware adaptation: repurposing two prosthetic hands equipped with touch sensors for research use. Using visuotactile data collected from our system, we learn skills to complete long-horizon, high-precision tasks that are difficult to achieve without multifingered dexterity and touch feedback. Furthermore, we empirically investigate the effects of dataset size, sensing modality, and visual input preprocessing on policy learning. Our results mark a promising step forward in bimanual multifingered manipulation from visuotactile data. Videos, code, and datasets can be found at https://toruowo.github.io/hato/ .