Quality-Diversity (QD) methods are algorithms that aim to generate a set of diverse and high-performing solutions to a given problem. QD was originally developed for evolutionary robotics, and most QD studies are conducted on a limited set of domains, mainly locomotion, where the fitness and behavior signals are dense. Grasping is a crucial task for robotic manipulation and, despite the efforts of many research communities, it remains unsolved. Grasping combines challenges that are unprecedented in the QD literature: it suffers from reward sparsity, behavioral sparsity, and behavior space misalignment. The present work studies how QD can address grasping. We evaluate 15 methods on 10 grasping domains, corresponding to 2 robot-gripper setups and 5 standard objects. For a fair comparison, we also propose an evaluation framework that distinguishes the evaluation of an algorithm from that of its internal components. The results show that MAP-Elites variants that prioritize the selection of successful solutions outperform all compared methods on the studied metrics by a large margin. We also found experimental evidence that sparse interaction can lead to deceptive novelty. To our knowledge, the ability to efficiently produce examples of grasping trajectories demonstrated in this work has no precedent in the literature.
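To make the idea of success-prioritized selection concrete, the following is a minimal sketch on a toy problem, not the implementation evaluated in this work. It assumes a 3-gene genome in which the first gene alone decides a sparse success signal and the other two genes define a 2D behavior descriptor. The names `evaluate`, `mutate`, `select_parent`, and `map_elites`, the grid size, and the rule "sample parents only from successful elites whenever any exist" are all illustrative assumptions; the abstract does not specify how the studied variants implement their priority.

```python
import random

GRID = 10  # cells per behavior dimension (toy value, an assumption)


def evaluate(genome):
    """Toy sparse task: only a narrow band of the first gene counts as success."""
    a, b, c = genome
    success = abs(a - 0.5) < 0.05
    # Fitness is zero for every failure, so the reward signal is sparse.
    fitness = 1.0 - abs(a - 0.5) / 0.05 if success else 0.0
    # Behavior descriptor: the last two genes, discretized onto the grid.
    cell = (min(int(b * GRID), GRID - 1), min(int(c * GRID), GRID - 1))
    return cell, fitness, success


def mutate(genome, rng, sigma=0.1):
    """Gaussian mutation, clipped to the unit cube."""
    return tuple(min(1.0, max(0.0, g + rng.gauss(0.0, sigma))) for g in genome)


def select_parent(archive, rng):
    """Sample from successful elites if any exist, else from all elites."""
    elites = list(archive.values())
    successful = [e for e in elites if e["success"]]
    pool = successful if successful else elites
    return rng.choice(pool)["genome"]


def map_elites(iterations=5000, n_init=100, seed=0):
    rng = random.Random(seed)
    archive = {}  # maps a behavior cell to its current elite
    for i in range(iterations):
        if i < n_init or not archive:
            genome = tuple(rng.random() for _ in range(3))  # random bootstrap
        else:
            genome = mutate(select_parent(archive, rng), rng)
        cell, fitness, success = evaluate(genome)
        elite = archive.get(cell)
        # Standard MAP-Elites replacement: keep the fittest solution per cell.
        if elite is None or fitness > elite["fitness"]:
            archive[cell] = {"genome": genome, "fitness": fitness, "success": success}
    return archive


if __name__ == "__main__":
    archive = map_elites()
    n_success = sum(e["success"] for e in archive.values())
    print(f"filled cells: {len(archive)}, successful cells: {n_success}")
```

Swapping `select_parent` for a uniform draw over all elites recovers plain MAP-Elites selection, which gives a simple baseline for comparing how many cells end up holding a successful solution in this toy setting.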