BODex: Scalable and Efficient Robotic Dexterous Grasp Synthesis Using Bilevel Optimization

Robotic dexterous grasping is a key step toward human-like manipulation. To fully unleash the potential of data-driven models for dexterous grasping, a large-scale, high-quality dataset is essential. While gradient-based optimization offers a promising way to construct such datasets, existing works suffer from limitations such as restrictive assumptions in energy design or experiments limited to small object sets. Moreover, the lack of a standard benchmark for comparing synthesis methods and datasets hinders progress in this field. To address these challenges, we develop a highly efficient synthesis system and a comprehensive MuJoCo-based benchmark for dexterous grasping. Our system formulates grasp synthesis as a bilevel optimization problem, combining a novel lower-level quadratic program (QP) with an upper-level gradient descent process. By leveraging recent advances in CUDA-accelerated robotic libraries and GPU-based QP solvers, our system can optimize thousands of grasps in parallel and synthesize over 49 grasps per second on a single NVIDIA 3090 GPU. Our synthesized grasps for the Shadow Hand and Allegro Hand achieve a success rate above 75% in MuJoCo, with penetration depth and contact distance both under 1 mm, outperforming existing baselines on nearly all metrics. Compared to the previous large-scale dataset, DexGraspNet, our dataset significantly improves the performance of learning models, raising the simulation success rate from around 40% to 80%. In real-world tests on the Shadow Hand, the trained model achieves an 81% success rate across 20 diverse objects.
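
To make the bilevel structure concrete, the following is a minimal toy sketch in Python, not the BODex implementation. It assumes a 2D unit disc held by two frictionless point contacts: the upper level moves the contact angles by gradient descent, and the lower level solves a small box-constrained QP for the contact forces that best produce a target force. The function names (`grasp_matrix`, `solve_lower_qp`, `upper_energy`, `synthesize`), the least-squares objective, the projected-gradient QP solver, and the finite-difference upper-level gradient are all illustrative assumptions; the abstract does not specify the paper's actual energy, QP solver, or differentiation scheme.

```python
import numpy as np


def grasp_matrix(thetas):
    """Columns are inward unit normals of contacts on a unit disc (toy assumption)."""
    return np.stack([-np.cos(thetas), -np.sin(thetas)], axis=0)  # shape (2, n)


def solve_lower_qp(G, w, f_max=1.0, iters=200):
    """Lower level: min_f 0.5 * ||G f - w||^2  subject to  0 <= f <= f_max.

    Solved with projected gradient descent, a simple stand-in for a real QP solver.
    """
    f = np.zeros(G.shape[1])
    # Step size 1/L, where L is the largest eigenvalue of G^T G.
    step = 1.0 / (np.linalg.norm(G, 2) ** 2 + 1e-9)
    for _ in range(iters):
        f = np.clip(f - step * (G.T @ (G @ f - w)), 0.0, f_max)
    return f


def upper_energy(thetas, w):
    """Upper-level energy: residual of the best forces the lower level can find."""
    G = grasp_matrix(thetas)
    r = G @ solve_lower_qp(G, w) - w
    return 0.5 * float(r @ r)


def synthesize(thetas0, w, lr=0.2, steps=200, eps=1e-4):
    """Upper level: gradient descent on contact angles (central finite differences)."""
    thetas = np.array(thetas0, dtype=float)
    history = [upper_energy(thetas, w)]
    for _ in range(steps):
        grad = np.zeros_like(thetas)
        for i in range(len(thetas)):
            d = np.zeros_like(thetas)
            d[i] = eps
            grad[i] = (upper_energy(thetas + d, w) - upper_energy(thetas - d, w)) / (2 * eps)
        thetas -= lr * grad
        history.append(upper_energy(thetas, w))
    return thetas, history


if __name__ == "__main__":
    target = np.array([0.0, 1.5])  # force the contacts should jointly produce
    thetas, history = synthesize([-2.6, -0.4], target)
    print("final angles:", thetas)
    print("energy: %.4f -> %.2e" % (history[0], history[-1]))
```

In this toy, the initial contacts sit too far apart to produce the target force within the force bound, so the lower-level residual is nonzero; the upper-level descent slides both contacts toward the bottom of the disc until the lower-level QP becomes feasible. A GPU system like the one described would batch many such problems in parallel rather than loop over them.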
