Universal dexterous grasping across diverse objects presents a fundamental yet formidable challenge in robot learning. Existing approaches that use reinforcement learning (RL) to train policies on extensive object datasets face critical limitations, including complex curriculum design for multi-task learning and limited generalization to unseen objects. To overcome these challenges, we introduce ResDex, a novel approach that integrates residual policy learning with a mixture-of-experts (MoE) framework. ResDex is distinguished by its use of geometry-unaware base policies, which are efficiently acquired on individual objects and are capable of generalizing across a wide range of unseen objects. Our MoE framework incorporates several base policies to facilitate diverse grasping styles suitable for various objects. By learning residual actions alongside the weights that combine these base policies, ResDex enables efficient multi-task RL for universal dexterous grasping. ResDex achieves state-of-the-art performance on the DexGraspNet dataset, which comprises 3,200 objects, with an 88.8% success rate. It exhibits no generalization gap on unseen objects and demonstrates superior training efficiency, mastering all tasks within only 12 hours on a single GPU.
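To make the combination of base policies and residual actions concrete, the following is a minimal sketch, not the ResDex implementation. It assumes that the learned weights are normalized with a softmax, that the base-policy actions are mixed by a weighted sum, and that the residual is added to the mixed action; the abstract does not specify any of these details. The names `softmax`, `combine_actions`, `base_actions`, `weight_logits`, and `residual` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine_actions(base_actions, weight_logits, residual):
    """Mix K base-policy actions and add a residual correction.

    base_actions:  K action vectors, one per base policy (each of length D)
    weight_logits: K unnormalized mixing scores (assumed output of the learned policy)
    residual:      residual action of length D (assumed output of the learned policy)
    """
    # Assumption: mixing weights are a softmax over the predicted scores.
    weights = softmax(weight_logits)
    dim = len(residual)
    # Weighted sum of the base actions, computed per action dimension.
    mixed = [
        sum(w * action[d] for w, action in zip(weights, base_actions))
        for d in range(dim)
    ]
    # Assumption: the residual is added directly to the mixed action.
    return [m + r for m, r in zip(mixed, residual)]


# Two hypothetical base policies, equal scores, and a small residual.
final_action = combine_actions(
    base_actions=[[1.0, 0.0], [0.0, 1.0]],
    weight_logits=[0.0, 0.0],
    residual=[0.1, -0.1],
)
```

With equal scores, each base action receives weight 0.5, so the mixed action is [0.5, 0.5] and the residual shifts it to approximately [0.6, 0.4]. In this sketch the base policies are treated as fixed inputs, and only the scores and the residual would be produced by the policy trained with multi-task RL.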