We introduce Equivariant Neural Field Expectation Maximization (EFEM), a simple, effective, and robust geometric algorithm that can segment objects in 3D scenes without annotations or training on scenes. We achieve this unsupervised segmentation by exploiting single-object shape priors. We take two novel steps in that direction. First, we introduce equivariant shape representations to this problem to eliminate the complexity induced by variation in object configuration. Second, we propose a novel EM algorithm that iteratively refines segmentation masks using the equivariant shape prior. To verify the effectiveness and robustness of our method, we collect a novel real-world dataset, Chairs and Mugs, that contains various object configurations and novel scenes. Experimental results demonstrate that our method achieves consistent and robust performance across different scenes where (weakly) supervised methods may fail. Code and data are available at https://www.cis.upenn.edu/~leijh/projects/efem