Reliable object grasping is a crucial capability for autonomous robots. However, many existing grasping approaches focus on general clutter removal without explicitly modeling objects, and thus rely only on the visible local geometry. We introduce CenterGrasp, a novel framework that combines object awareness and holistic grasping. CenterGrasp learns a general object prior by encoding shapes and valid grasps in a continuous latent space. It consists of an RGB-D image encoder that leverages recent advances to detect objects and infer their pose and latent code, and a decoder that predicts the shape and grasps for each object in the scene. We perform extensive experiments on simulated as well as real-world cluttered scenes and demonstrate strong scene reconstruction and 6-DoF grasp-pose estimation performance. Compared to the state of the art, CenterGrasp achieves an improvement of 38.5 mm in shape reconstruction and of 33 percentage points on average in grasp success. We make the code and trained models publicly available at http://centergrasp.cs.uni-freiburg.de.