We introduce SPAFormer, a model designed to overcome the combinatorial explosion challenge in the 3D Part Assembly (3D-PA) task. The task requires accurately predicting each part's pose and shape in sequential steps; as the number of parts grows, the number of possible assembly combinations grows exponentially, and the resulting combinatorial explosion severely hinders the efficacy of 3D-PA. SPAFormer addresses this problem by leveraging weak constraints from assembly sequences, effectively reducing the complexity of the solution space. Because a sequence of parts conveys construction rules much as words structure a sentence, our model explores both parallel and autoregressive generation. It further improves assembly through knowledge enhancement strategies that exploit part attributes and sequence information, enabling it to capture the inherent assembly patterns and the relationships among sequentially ordered parts. We also construct a more challenging benchmark, PartNet-Assembly, covering 21 varied categories, to validate the effectiveness of SPAFormer more comprehensively. Extensive experiments demonstrate the superior generalization capabilities of SPAFormer, particularly in multi-task settings and in scenarios requiring long-horizon assembly. Code and model weights will be released at \url{https://github.com/xuboshen/SPAFormer}.