A truly generalizable approach to rigid segmentation and motion estimation is fundamental to 3D understanding of articulated objects and moving scenes. Because segmentation and motion estimation are tightly coupled, we present an SE(3)-equivariant architecture and a training strategy that tackle both tasks in an unsupervised manner. Our architecture comprises two lightweight, interconnected heads: one predicts segmentation masks from point-level invariant features, and the other predicts motion estimates from SE(3)-equivariant features, with neither requiring category information. Our unified training strategy can be performed online and jointly optimizes the two predictions by exploiting the interrelations among scene flow, segmentation masks, and rigid transformations. Experiments on four datasets demonstrate the superiority of our method in both model performance and computational efficiency, with only 0.25M parameters and 0.92G FLOPs. To the best of our knowledge, this is the first work designed for category-agnostic part-level SE(3) equivariance in dynamic point clouds.
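The interrelation among scene flow, segmentation masks, and rigid transformations can be illustrated with a minimal sketch. The abstract does not spell out the formulation, so the code assumes a common one: each point p_i carries soft membership weights m_ik over K rigid parts, each part has a rigid transform (R_k, t_k), and the induced flow is the mask-weighted sum of the transformed point minus the original point. All function names (`rot_z`, `apply_rigid`, `flow_from_parts`) are hypothetical and are not taken from the paper; plain Python lists are used only to keep the example dependency-free.

```python
import math

def rot_z(theta):
    """3x3 rotation matrix about the z-axis (illustrative helper)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_rigid(R, t, p):
    """Return R @ p + t for a single 3D point p."""
    return [sum(R[r][c] * p[c] for c in range(3)) + t[r] for r in range(3)]

def flow_from_parts(points, masks, transforms):
    """Assumed coupling: flow_i = sum_k m_ik * (R_k p_i + t_k) - p_i.

    points:     list of N points, each [x, y, z]
    masks:      list of N rows, each with K membership weights summing to 1
    transforms: list of K (R, t) pairs, one rigid motion per part
    """
    flows = []
    for p, weights in zip(points, masks):
        moved = [0.0, 0.0, 0.0]
        for w, (R, t) in zip(weights, transforms):
            q = apply_rigid(R, t, p)
            moved = [a + w * b for a, b in zip(moved, q)]
        flows.append([a - b for a, b in zip(moved, p)])
    return flows

# Toy scene: part 0 is static, part 1 rotates 90 degrees about z and lifts by 1.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
points = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
masks = [[1.0, 0.0], [0.0, 1.0]]  # hard (one-hot) assignment for clarity
transforms = [(identity, [0.0, 0.0, 0.0]), (rot_z(math.pi / 2), [0.0, 0.0, 1.0])]
flows = flow_from_parts(points, masks, transforms)
```

Under this assumed formulation, a discrepancy between the mask-and-transform flow and an independently estimated scene flow gives a signal that depends on both predictions at once, which is one way the two heads could be optimized jointly without labels.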