We propose a method for 3D shape reconstruction from unoriented point clouds. Our method consists of a novel SE(3)-equivariant coordinate-based network (TF-ONet) that parametrizes the occupancy field of the shape and respects the inherent symmetries of the problem. In contrast to previous shape reconstruction methods that align the input to a regular grid, we operate directly on the irregular point cloud. Our architecture leverages equivariant attention layers that operate on local tokens. This mechanism enables local shape modelling, a crucial property for scalability to large scenes. Given an unoriented, sparse, noisy point cloud as input, we produce equivariant features for each point. These serve as keys and values for the subsequent equivariant cross-attention blocks that parametrize the occupancy field. By querying an arbitrary point in space, we predict its occupancy score. We show that our method outperforms previous SO(3)-equivariant methods, as well as non-equivariant methods trained on SO(3)-augmented datasets. More importantly, local modelling combined with SE(3)-equivariance creates an ideal setting for SE(3) scene reconstruction. We show that by training only on single, aligned objects and without any pre-segmentation, we can reconstruct novel scenes containing arbitrarily many objects in random poses without any performance loss.
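To make the symmetry requirement concrete, the following minimal sketch checks the property an SE(3)-equivariant occupancy field has to satisfy: applying the same rotation and translation to both the input point cloud and the query point must leave the predicted occupancy unchanged. The function `toy_occupancy` is a hypothetical stand-in that depends only on point-to-query distances; it is not TF-ONet and uses no attention layers. The names, the radius, and the sharpness constant are illustrative assumptions.

```python
import numpy as np


def random_rotation(rng):
    # QR decomposition of a Gaussian matrix yields an orthogonal matrix.
    # Fixing the column signs and the determinant gives a proper rotation.
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q


def toy_occupancy(query, cloud, radius=0.2, sharpness=0.05):
    # Hypothetical stand-in for a learned occupancy network (assumption):
    # a soft score in (0, 1) that depends only on the distance from the
    # query to its nearest cloud point, hence invariant to rigid motions
    # applied jointly to the query and the cloud.
    d_min = np.linalg.norm(cloud - query, axis=1).min()
    return float(1.0 / (1.0 + np.exp((d_min - radius) / sharpness)))


rng = np.random.default_rng(0)
cloud = rng.normal(size=(256, 3))   # unoriented input point cloud
query = rng.normal(size=3)          # arbitrary query point in space

R = random_rotation(rng)            # random element of SO(3)
t = rng.normal(size=3)              # random translation

occ_original = toy_occupancy(query, cloud)
# Apply the same rigid motion x -> R x + t to the cloud and the query.
occ_moved = toy_occupancy(R @ query + t, cloud @ R.T + t)

print("occupancy before:", occ_original)
print("occupancy after: ", occ_moved)
```

The two printed scores should agree up to floating-point error. A method that first aligns the input to a fixed, axis-aligned grid would generally not pass this check without pose augmentation, which is the contrast the abstract draws.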