Dense 3D correspondence can enhance robotic manipulation by enabling spatial, functional, and dynamic information to generalize from one object to an unseen counterpart. Compared to shape correspondence, semantic correspondence generalizes more effectively across different object categories. To this end, we present DenseMatcher, a method that computes 3D correspondences between in-the-wild objects that share similar structures. DenseMatcher first computes vertex features by projecting multiview 2D features onto meshes and refining them with a 3D network, and then finds dense correspondences from the resulting features using functional maps. In addition, we construct the first 3D matching dataset that contains colored object meshes across diverse categories. In our experiments, DenseMatcher significantly outperforms prior 3D matching baselines by 43.5%. We demonstrate the downstream effectiveness of DenseMatcher in (i) robotic manipulation, where it achieves cross-instance and cross-category generalization on long-horizon complex manipulation tasks from observing only one demonstration; and (ii) zero-shot color mapping between digital assets, where appearance can be transferred between different objects with related geometry.
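The functional-map step mentioned above can be illustrated with a minimal sketch. It is not DenseMatcher's implementation: it assumes the per-vertex features and a truncated spectral basis for each mesh (for example, Laplace-Beltrami eigenfunctions) are already computed, it omits the regularization terms that practical pipelines usually add, and the function names `fit_functional_map` and `pointwise_map` are hypothetical.

```python
import numpy as np


def fit_functional_map(phi_src, phi_tgt, feat_src, feat_tgt):
    """Estimate a functional map C (k x k) from per-vertex features.

    phi_src:  (n_src, k) spectral basis on the source mesh (assumed given).
    phi_tgt:  (n_tgt, k) spectral basis on the target mesh (assumed given).
    feat_src: (n_src, d) vertex features on the source mesh.
    feat_tgt: (n_tgt, d) vertex features on the target mesh.
    C maps spectral coefficients of a source function to target coefficients.
    """
    # Express each feature channel in its mesh's basis (least-squares projection).
    a = np.linalg.lstsq(phi_src, feat_src, rcond=None)[0]  # (k, d)
    b = np.linalg.lstsq(phi_tgt, feat_tgt, rcond=None)[0]  # (k, d)
    # Solve C @ a ~= b, written as a.T @ C.T ~= b.T for lstsq.
    c_t = np.linalg.lstsq(a.T, b.T, rcond=None)[0]  # (k, k)
    return c_t.T


def pointwise_map(phi_src, phi_tgt, c):
    """Recover, for every target vertex, the index of its matching source vertex.

    Each row of phi_tgt @ c is compared against the rows of phi_src,
    and the nearest row (brute-force search) gives the match.
    """
    mapped = phi_tgt @ c  # (n_tgt, k)
    diff = mapped[:, None, :] - phi_src[None, :, :]  # (n_tgt, n_src, k)
    sq_dist = np.sum(diff**2, axis=-1)  # (n_tgt, n_src)
    return np.argmin(sq_dist, axis=1)
```

The brute-force nearest-neighbor search is quadratic in the number of vertices, so a real mesh would call for a spatial index such as a k-d tree. The least-squares fit also needs at least as many independent feature channels as basis functions for C to be well determined.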