Interactive segmentation of 3D Gaussians opens up great opportunities for real-time manipulation of 3D scenes, thanks to the real-time rendering capability of 3D Gaussian Splatting. However, current methods suffer from time-consuming post-processing to deal with noisy segmentation output. They also struggle to provide detailed segmentation, which is important for fine-grained manipulation of 3D scenes. In this study, we propose Click-Gaussian, which learns distinguishable feature fields of two-level granularity, facilitating segmentation without time-consuming post-processing. We delve into the challenges of inconsistently learned feature fields, which arise because 2D segmentation is obtained independently of the 3D scene: when the 2D segmentation results across views, the primary cues for 3D segmentation, conflict with one another, 3D segmentation accuracy deteriorates. To overcome these issues, we propose Global Feature-guided Learning (GFL). GFL constructs clusters of global feature candidates from noisy 2D segments across views, which smooths out noise when training the features of 3D Gaussians. Our method runs in 10 ms per click, 15 to 130 times faster than previous methods, while also significantly improving segmentation accuracy. Our project page is available at https://seokhunchoi.github.io/Click-Gaussian
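The GFL idea described above can be sketched minimally as follows. This is an illustrative assumption, not the paper's actual implementation: the function names, the use of a simple k-means pass as the clustering step, and the squared-distance pull toward the nearest global cluster are all stand-ins for whatever clustering and training objective the method actually uses.

```python
import numpy as np

def build_global_clusters(view_features, n_clusters=4, n_iters=10, seed=0):
    """Cluster per-view 2D segment features into global feature candidates.

    `view_features` is a list of (num_segments, dim) arrays, one per view.
    A plain k-means pass stands in for the paper's clustering step (assumption).
    """
    feats = np.concatenate(view_features, axis=0)  # pool segments from all views
    rng = np.random.default_rng(seed)
    centers = feats[rng.choice(len(feats), n_clusters, replace=False)].copy()
    for _ in range(n_iters):
        # Assign every pooled segment feature to its nearest center.
        dists = np.linalg.norm(feats[:, None, :] - centers[None, :, :], axis=-1)
        assign = dists.argmin(axis=1)
        # Recompute each center as the mean of its assigned features.
        for k in range(n_clusters):
            if np.any(assign == k):
                centers[k] = feats[assign == k].mean(axis=0)
    return centers

def gfl_guidance_loss(gaussian_features, global_centers):
    """Pull each 3D Gaussian's feature toward its nearest global cluster center,
    smoothing out per-view noise (squared-distance proxy for the GFL objective)."""
    dists = np.linalg.norm(
        gaussian_features[:, None, :] - global_centers[None, :, :], axis=-1
    )
    nearest = dists.min(axis=1)
    return float((nearest ** 2).mean())
```

Under this sketch, the clustering step runs once over the noisy per-view 2D segment features, and the resulting global candidates act as stable targets while the per-Gaussian features are trained.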