The ability to manipulate tools significantly expands the set of tasks a robot can perform. Yet tool manipulation is a demanding class of dexterity: it requires grasping thin objects, rotating objects in-hand, and applying forceful interactions. Because teleoperation data for these behaviors is difficult to collect, sim-to-real reinforcement learning (RL) is a promising alternative. However, prior approaches typically require substantial engineering effort to model objects and tune reward functions for each task. In this work, we propose SimToolReal, a step towards generalizing sim-to-real RL policies for tool manipulation. Instead of focusing on a single object and task, we procedurally generate a large variety of tool-like object primitives in simulation and train a single RL policy with the universal goal of manipulating each object to random goal poses. This approach enables SimToolReal to perform general dexterous tool manipulation at test time without any object- or task-specific training. We demonstrate that SimToolReal outperforms prior retargeting and fixed-grasp methods by 37% while matching the performance of specialist RL policies trained on specific target objects and tasks. Finally, we show that SimToolReal generalizes across a diverse set of everyday tools, achieving strong zero-shot performance over 120 real-world rollouts spanning 24 tasks, 12 object instances, and 6 tool categories.
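To make the training setup concrete, the following is a minimal sketch of how one episode's tool-like primitive and random goal pose might be sampled. Everything specific here is an assumption for illustration and is not taken from the paper: the handle-plus-head parameterization, the names `ToolPrimitive`, `GoalPose`, `sample_tool`, `sample_goal_pose`, and `sample_episode`, and all numeric ranges. The orientation is drawn as a uniformly random unit quaternion using Shoemake's method.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class ToolPrimitive:
    """Hypothetical handle-plus-head primitive; all dimensions in metres."""
    handle_length: float
    handle_radius: float
    head_size: tuple  # (x, y, z) extents of a box-shaped head


@dataclass
class GoalPose:
    position: tuple    # (x, y, z) in metres
    quaternion: tuple  # (x, y, z, w), unit norm


def sample_tool(rng):
    # Ranges are illustrative assumptions, not values from the paper.
    return ToolPrimitive(
        handle_length=rng.uniform(0.08, 0.25),
        handle_radius=rng.uniform(0.005, 0.02),
        head_size=tuple(rng.uniform(0.02, 0.10) for _ in range(3)),
    )


def sample_goal_pose(rng):
    # Position: uniform inside an assumed workspace box.
    position = (
        rng.uniform(-0.2, 0.2),
        rng.uniform(-0.2, 0.2),
        rng.uniform(0.1, 0.4),
    )
    # Orientation: uniform random unit quaternion (Shoemake's method).
    u1, u2, u3 = rng.random(), rng.random(), rng.random()
    a, b = math.sqrt(1.0 - u1), math.sqrt(u1)
    quaternion = (
        a * math.sin(2 * math.pi * u2),
        a * math.cos(2 * math.pi * u2),
        b * math.sin(2 * math.pi * u3),
        b * math.cos(2 * math.pi * u3),
    )
    return GoalPose(position, quaternion)


def sample_episode(seed):
    # A per-episode seed keeps object and goal sampling reproducible.
    rng = random.Random(seed)
    return sample_tool(rng), sample_goal_pose(rng)
```

Calling `sample_episode(seed)` with a different seed per environment would yield a fresh object and goal for each rollout. In a real pipeline the sampled parameters would still need to be turned into simulator assets, which this sketch does not attempt.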