Many existing learning-based grasping approaches concentrate on a single embodiment, offer limited generalization to higher-DoF end-effectors, and cannot capture a diverse set of grasp modes. We tackle the problem of grasping with multiple embodiments by learning rich geometric representations of both objects and end-effectors using Graph Neural Networks. Our novel method, GeoMatch, applies supervised learning to grasping data from multiple embodiments, learning end-to-end contact point likelihood maps as well as conditional autoregressive predictions of grasps, keypoint by keypoint. We compare our method against baselines that support multiple embodiments. Our approach outperforms these baselines across three end-effectors while also producing diverse grasps. Examples, including real robot demos, can be found at geo-match.github.io.
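To make the phrase "conditional autoregressive predictions of grasps, keypoint by keypoint" concrete, the following is a minimal sketch of one possible decoding loop, not the GeoMatch implementation. It assumes a per-keypoint likelihood map over object points is already available, and it replaces the learned conditional model with a hand-written heuristic that suppresses points near earlier contacts. The names `toy_conditional_likelihood` and `predict_contacts`, the Gaussian suppression term, and its bandwidth of 0.01 are all hypothetical choices made for illustration; no Graph Neural Network is involved.

```python
import math
import random


def toy_conditional_likelihood(base_scores, points, chosen):
    """Hypothetical stand-in for a learned conditional model.

    Returns a normalized likelihood over object points for the next keypoint,
    given the indices of the contact points already selected.
    """
    scores = list(base_scores)
    for idx in chosen:
        for j, p in enumerate(points):
            d2 = sum((a - b) ** 2 for a, b in zip(p, points[idx]))
            # Assumed heuristic: suppress points close to earlier contacts.
            scores[j] *= 1.0 - math.exp(-d2 / 0.01)
    total = sum(scores)
    if total <= 0.0:
        # Degenerate case: fall back to a uniform distribution.
        return [1.0 / len(points)] * len(points)
    return [s / total for s in scores]


def predict_contacts(base_map, points, rng=None):
    """Select one contact point per keypoint, in order.

    base_map[k][j] is the unconditional likelihood that keypoint k touches
    object point j. With rng=None the most likely point is taken greedily;
    passing a random.Random instance samples instead, which yields varied grasps.
    """
    chosen = []
    for base_scores in base_map:
        probs = toy_conditional_likelihood(base_scores, points, chosen)
        if rng is None:
            idx = max(range(len(points)), key=probs.__getitem__)
        else:
            idx = rng.choices(range(len(points)), weights=probs, k=1)[0]
        chosen.append(idx)
    return chosen


# Toy object: four points on a 10 cm square, three end-effector keypoints.
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (0.1, 0.1, 0.0)]
base_map = [
    [0.7, 0.1, 0.1, 0.1],
    [0.6, 0.2, 0.1, 0.1],
    [0.5, 0.1, 0.3, 0.1],
]
greedy_contacts = predict_contacts(base_map, points)
sampled_contacts = predict_contacts(base_map, points, rng=random.Random(0))
```

In this toy setting, every keypoint prefers the first object point when considered alone, but conditioning on earlier choices pushes later keypoints to different points. Sampling rather than taking the maximum is one simple way such a scheme could produce diverse grasps for the same object.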