We introduce SPAFormer, an innovative model designed to address the combinatorial explosion in the 3D Part Assembly (3D-PA) task. The task requires accurately predicting each part's pose and shape in sequential steps; as the number of parts grows, the number of possible assembly combinations grows exponentially, and this combinatorial explosion severely limits the effectiveness of 3D-PA methods. SPAFormer addresses the problem by leveraging weak constraints from assembly sequences, which reduces the complexity of the solution space. Because a sequence of parts conveys construction rules much as a sequence of words structures a sentence, the model explores both parallel and autoregressive generation. It further improves assembly through knowledge enhancement strategies that use part attributes and sequence information, enabling it to capture the inherent assembly patterns and relationships among sequentially ordered parts. We also construct a more challenging benchmark, PartNet-Assembly, covering 21 varied categories, to validate SPAFormer more comprehensively. Extensive experiments demonstrate the superior generalization of SPAFormer, particularly in multi-task settings and in scenarios requiring long-horizon assembly. Code and model weights will be released at https://github.com/xuboshen/SPAFormer.
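As a rough illustration of the combinatorial explosion the abstract refers to, the sketch below counts candidate assemblies under a deliberately simplified model. It is not part of SPAFormer: the helper names `joint_pose_space` and `ordering_count` are hypothetical, and the assumption that each part picks from a fixed number of discretized candidate poses is ours (real part poses are continuous).

```python
from math import factorial


def joint_pose_space(num_parts: int, poses_per_part: int) -> int:
    """Joint search-space size if every part independently takes one of
    `poses_per_part` discretized candidate poses (a toy assumption)."""
    return poses_per_part ** num_parts


def ordering_count(num_parts: int) -> int:
    """Number of distinct assembly orders when no sequence is given."""
    return factorial(num_parts)


if __name__ == "__main__":
    # Assumed toy setting: 10 candidate poses per part.
    for n in (4, 8, 16):
        print(
            f"parts={n:2d}  "
            f"pose combinations={joint_pose_space(n, 10)}  "
            f"possible orders={ordering_count(n)}"
        )
```

In this toy count, supplying an assembly sequence removes the ordering factor entirely, which gives some intuition for why sequence information can act as a constraint on the solution space, even a weak one.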