Reliable object grasping is a crucial capability for autonomous robots. However, many existing grasping approaches focus on general clutter removal without explicitly modeling objects, and thus rely only on the visible local geometry. We introduce CenterGrasp, a novel framework that combines object awareness and holistic grasping. CenterGrasp learns a general object prior by encoding shapes and valid grasps in a continuous latent space. It consists of an RGB-D image encoder that leverages recent advances to detect objects and infer their pose and latent code, and a decoder that predicts shape and grasps for each object in the scene. We perform extensive experiments on simulated as well as real-world cluttered scenes and demonstrate strong scene reconstruction and 6-DoF grasp-pose estimation performance. Compared to the state of the art, CenterGrasp achieves an improvement of 38.5 mm in shape reconstruction and 33 percentage points on average in grasp success. We make the code and trained models publicly available at http://centergrasp.cs.uni-freiburg.de.
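The encoder-decoder pipeline described above can be sketched as follows. This is a minimal, illustrative sketch only: all names (`ObjectHypothesis`, `encode`, `decode`), the latent dimension, and the array shapes are assumptions for illustration, not the authors' actual API, and the network bodies are replaced by dummy placeholders.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class ObjectHypothesis:
    """Per-object output of the hypothetical RGB-D encoder."""
    pose: np.ndarray         # 4x4 object pose in the camera frame (assumed)
    latent_code: np.ndarray  # continuous shape-and-grasp embedding (assumed)

def encode(rgb: np.ndarray, depth: np.ndarray) -> list[ObjectHypothesis]:
    """Stand-in for the RGB-D image encoder: detect objects in the scene
    and infer each object's pose and latent code. Dummy implementation."""
    # Placeholder: pretend one object was detected at the identity pose.
    return [ObjectHypothesis(pose=np.eye(4), latent_code=np.zeros(32))]

def decode(obj: ObjectHypothesis) -> tuple[np.ndarray, np.ndarray]:
    """Stand-in for the decoder: map a latent code to a reconstructed
    shape and a set of valid 6-DoF grasp poses. Dummy implementation."""
    shape_points = np.zeros((128, 3))  # reconstructed surface samples (assumed size)
    grasp_poses = np.zeros((8, 4, 4))  # candidate 6-DoF grasps as 4x4 poses (assumed)
    return shape_points, grasp_poses

# Usage: one forward pass over a (dummy) RGB-D frame.
rgb = np.zeros((480, 640, 3), dtype=np.uint8)
depth = np.zeros((480, 640), dtype=np.float32)
objects = encode(rgb, depth)
results = [decode(obj) for obj in objects]
```

Note that, per the abstract, shape and grasps are decoded jointly from a single continuous latent code per object, rather than being predicted from the visible local geometry alone.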