Robotic grasping in clutter is a fundamental task in robotic manipulation. In this work, we propose an economic framework for 6-DoF grasp detection that aims to reduce the resource cost of training while maintaining effective grasp performance. We first observe that dense supervision is the bottleneck of current SOTA methods: it severely inflates the overall training overhead and also makes training difficult to converge. To address this problem, we propose an economic supervision paradigm for efficient and effective grasping. This paradigm includes a well-designed supervision selection strategy, which selects key labels that are largely free of ambiguity, and an economic pipeline that enables training after selection. Furthermore, benefiting from the economic supervision, we can focus on a specific grasp, and thus we devise a focal representation module, which comprises an interactive grasp head and a composite score estimation to generate the specific grasp more accurately. Combining these components, we propose the EconomicGrasp framework. Extensive experiments show that EconomicGrasp surpasses the SOTA grasp method by about 3 AP on average at extremely low resource cost: about 1/4 of the training time, 1/8 of the memory cost, and 1/30 of the storage cost. Our code is available at https://github.com/iSEE-Laboratory/EconomicGrasp.