Shape assembly, which aims to reassemble separate parts into a complete object, has attracted significant interest in recent years. Existing methods primarily rely on networks to predict the pose of each part, but they often fail to capture the geometric interactions between parts and their poses. In this paper, we present the Geometric Point Attention Transformer (GPAT), a network designed specifically to reason about these geometric relationships. Its geometric point attention module integrates global shape information and local pairwise geometric features with the pose of each part, represented as rotation and translation vectors. To enable iterative updates and dynamic reasoning, we introduce a geometric recycling scheme in which each prediction is fed into the next iteration for refinement. We evaluate our model on both semantic and geometric assembly tasks and show that it outperforms previous methods in absolute pose estimation, achieving accurate pose predictions and high alignment accuracy.
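The recycling idea described in the abstract, where each pose prediction becomes the input to the next round, can be illustrated with a minimal sketch. Everything named here is hypothetical: `toy_predictor` is a stand-in for the network that simply moves each translation halfway toward a known target, and rotations are stored as 3x3 matrices for simplicity. It shows only the control flow of iterative refinement, not the GPAT architecture.

```python
import numpy as np


def toy_predictor(part_points, rotations, translations, target_translations):
    """Placeholder for a learned pose predictor (NOT the GPAT network).

    It leaves rotations unchanged and moves each translation halfway
    toward a known target, so the effect of recycling is easy to see.
    """
    new_translations = translations + 0.5 * (target_translations - translations)
    return rotations, new_translations


def recycle(part_points, target_translations, num_rounds=4):
    """Run the predictor repeatedly, feeding each output into the next round."""
    num_parts = len(part_points)
    # Start every part at the identity pose.
    rotations = np.tile(np.eye(3), (num_parts, 1, 1))
    translations = np.zeros((num_parts, 3))
    for _ in range(num_rounds):
        # The previous prediction is the input to the next iteration.
        rotations, translations = toy_predictor(
            part_points, rotations, translations, target_translations
        )
    return rotations, translations


def apply_pose(points, rotation, translation):
    """Move one part's points (N, 3) into the assembled frame."""
    return points @ rotation.T + translation
```

With a target translation of 1 on every axis, the remaining error halves each round, so more rounds bring the poses closer to the target. In a real model the predictor would be a trained network and no target would be available at inference time.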