Removing clutter from scenes is essential in many applications, ranging from privacy-sensitive content filtering to data augmentation. In this work, we present an automatic system that removes clutter from 3D scenes and inpaints the resulting holes with coherent geometry and texture. We propose techniques for its two key components: 3D segmentation from shared properties and 3D inpainting, both of which are important problems. The definition of 3D scene clutter (frequently moving objects) is not well captured by the object categories commonly studied in computer vision. To tackle the lack of well-defined clutter annotations, we group noisy fine-grained labels, leverage virtual rendering, and impose an instance-level area-sensitive loss. Once clutter is removed, we inpaint geometry and texture in the resulting holes by merging inpainted RGB-D images. This requires novel voting and pruning strategies that guarantee multi-view consistency across individually inpainted images for mesh reconstruction. Experiments on the ScanNet and Matterport datasets show that our method outperforms baselines for clutter segmentation and 3D inpainting, both visually and quantitatively.