The versatility and adaptability of human grasping motivate advances in dexterous robotic manipulation. While dexterous grasp generation has progressed significantly, current research is shifting toward optimizing object manipulation while preserving functional integrity, emphasizing the synthesis of functional grasps that follow desired affordance instructions. This paper addresses the challenge of synthesizing functional grasps for diverse dexterous robotic hands by proposing DexGrasp-Diffusion, an end-to-end, modular, diffusion-based pipeline. DexGrasp-Diffusion integrates MultiHandDiffuser, a novel unified data-driven diffusion model for grasp estimation across multiple dexterous hands, with DexDiscriminator, which employs a Physics Discriminator and a Functional Discriminator in an open-vocabulary setting to filter physically plausible functional grasps based on object affordances. Experiments on the MultiDex dataset show that MultiHandDiffuser outperforms the baseline model in success rate, grasp diversity, and collision depth. Moreover, we demonstrate that DexGrasp-Diffusion reliably generates functional grasps for household objects that align with specific affordance instructions.