Quality-Diversity (QD) methods are algorithms that aim to generate a set of diverse and high-performing solutions to a given problem. Although QD was originally developed for evolutionary robotics, most QD studies are conducted on a limited set of domains, mainly locomotion, where the fitness and behavior signals are dense. Grasping is a crucial task for robotic manipulation and, despite the efforts of many research communities, it remains unsolved. Grasping combines challenges that are unprecedented in the QD literature: reward sparsity, behavioral sparsity, and behavior space misalignment. The present work studies how QD can address grasping. We compare 15 methods on 10 grasping domains, corresponding to 2 robot-gripper setups and 5 standard objects. For a fair comparison, we also propose an evaluation framework that distinguishes the evaluation of an algorithm from its internal components. The results show that MAP-Elites variants that prioritize the selection of successful solutions outperform all the compared methods on the studied metrics by a large margin. We also found experimental evidence that sparse interaction can lead to deceptive novelty. To our knowledge, the ability to efficiently produce examples of grasping trajectories demonstrated in this work has no precedent in the literature.
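
To make the idea of a MAP-Elites variant that selects successful solutions in priority more concrete, the following is a minimal sketch on a toy problem. It is not the implementation evaluated in this work: the `evaluate` function, the `Elite` record, the `select_parent` and `map_elites` helpers, the one-dimensional behavior grid, and the success threshold are all hypothetical choices made for illustration. The only part that mirrors the text is the selection rule, which draws parents from the successful elites whenever at least one exists.

```python
import random
from typing import NamedTuple


class Elite(NamedTuple):
    genome: list
    fitness: float
    success: bool


def evaluate(genome):
    """Toy stand-in for a grasp rollout (hypothetical, not a real simulator).

    Returns (behavior in [0, 1], fitness, success). Success is sparse:
    it only happens when the second gene is close to zero.
    """
    behavior = (genome[0] + 1.0) / 2.0
    success = abs(genome[1]) < 0.1
    fitness = 1.0 - abs(genome[1]) if success else 0.0
    return behavior, fitness, success


def select_parent(archive, rng):
    """Pick a parent, giving priority to successful elites (assumed rule)."""
    elites = list(archive.values())
    successful = [e for e in elites if e.success]
    return rng.choice(successful if successful else elites)


def mutate(genome, rng, sigma=0.1):
    """Gaussian mutation, clipped to the search bounds [-1, 1]."""
    return [max(-1.0, min(1.0, g + rng.gauss(0.0, sigma))) for g in genome]


def map_elites(n_iters=1000, n_cells=10, n_init=50, seed=0):
    """Basic MAP-Elites loop over a 1-D behavior grid."""
    rng = random.Random(seed)
    archive = {}  # cell index -> Elite
    for i in range(n_iters):
        if i < n_init or not archive:
            genome = [rng.uniform(-1.0, 1.0) for _ in range(2)]
        else:
            genome = mutate(select_parent(archive, rng).genome, rng)
        behavior, fitness, success = evaluate(genome)
        cell = min(int(behavior * n_cells), n_cells - 1)
        incumbent = archive.get(cell)
        # A successful candidate always beats an unsuccessful incumbent;
        # otherwise the higher fitness wins.
        if incumbent is None or (success, fitness) > (
            incumbent.success, incumbent.fitness
        ):
            archive[cell] = Elite(genome, fitness, success)
    return archive
```

Calling `map_elites(n_iters=2000)` returns a dictionary that maps each filled behavior cell to its elite. In this toy setting, the priority rule concentrates mutations on individuals that already succeed, which is one plausible reading of why such a rule could help when rewards are sparse.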