Real-world robotic grasping can be performed robustly if complete 3D Point Cloud Data (PCD) of an object is available. In practice, however, PCDs are often incomplete when objects are observed from only a few sparse viewpoints before the grasping action, leading to wrong or inaccurate grasp poses. We propose a novel grasping strategy, named 3DSGrasp, that predicts the missing geometry from the partial PCD to produce reliable grasp poses. Our PCD completion network is a Transformer-based encoder-decoder with an Offset-Attention layer. The network is inherently invariant to object pose and point permutation, and generates completed PCDs that are geometrically consistent. Experiments on a wide range of partial PCDs show that 3DSGrasp outperforms the best state-of-the-art method on the PCD completion task and substantially improves the grasping success rate in real-world scenarios. The code and dataset will be made available upon acceptance.
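For readers unfamiliar with the Offset-Attention mechanism referenced above, the following is a minimal illustrative sketch in PyTorch, modeled on the offset-attention block popularized by PCT (Guo et al., 2021). It is not the authors' implementation; the channel split, normalization scheme, and layer names here are assumptions chosen for clarity. The key idea is that the residual branch carries the offset (input features minus attention features) through a linear-BN-ReLU stage, rather than adding the attention output directly.

```python
import torch
import torch.nn as nn

class OffsetAttention(nn.Module):
    """Illustrative offset-attention block (PCT-style), not 3DSGrasp's exact layer.

    Operates on per-point features of shape (batch, channels, num_points).
    Attention is computed across points, so the block is permutation-equivariant:
    reordering the input points reorders the output features identically.
    """

    def __init__(self, channels: int):
        super().__init__()
        # Reduced-dimension query/key projections, full-dimension values.
        self.q_conv = nn.Conv1d(channels, channels // 4, 1, bias=False)
        self.k_conv = nn.Conv1d(channels, channels // 4, 1, bias=False)
        self.v_conv = nn.Conv1d(channels, channels, 1)
        self.trans_conv = nn.Conv1d(channels, channels, 1)
        self.after_norm = nn.BatchNorm1d(channels)
        self.act = nn.ReLU()
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q = self.q_conv(x).permute(0, 2, 1)   # (B, N, C/4)
        k = self.k_conv(x)                    # (B, C/4, N)
        v = self.v_conv(x)                    # (B, C, N)
        energy = torch.bmm(q, k)              # (B, N, N) pairwise point affinities
        attn = self.softmax(energy)
        # PCT-style renormalization: l1-normalize attention over the key axis.
        attn = attn / (1e-9 + attn.sum(dim=1, keepdim=True))
        x_attn = torch.bmm(v, attn)           # (B, C, N) attention features
        # Offset branch: transform the difference, then add it back residually.
        x_off = self.act(self.after_norm(self.trans_conv(x - x_attn)))
        return x + x_off
```

A quick sanity check such as `OffsetAttention(128)(torch.randn(2, 128, 1024))` returns a tensor of the same shape, confirming the block can be stacked inside an encoder over per-point feature maps.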