Existing methods for interactive segmentation in radiance fields entail scene-specific optimization and thus cannot generalize across different scenes, which greatly limits their applicability. In this work, we make the first attempt at Scene-Generalizable Interactive Segmentation in Radiance Fields (SGISRF) and propose a novel SGISRF method, which can perform 3D object segmentation for novel (unseen) scenes represented by radiance fields, guided by only a few interactive user clicks in a given set of multi-view 2D images. In particular, the proposed SGISRF addresses three crucial challenges with three specially designed techniques. First, we devise Cross-Dimension Guidance Propagation to encode the scarce 2D user clicks into informative 3D guidance representations. Second, the Uncertainty-Eliminated 3D Segmentation module is designed to achieve efficient yet effective 3D segmentation. Third, the Concealment-Revealed Supervised Learning scheme is proposed to reveal and correct the concealed 3D segmentation errors resulting from supervision in 2D space with only 2D mask annotations. Extensive experiments on two challenging real-world benchmarks covering diverse scenes demonstrate 1) the effectiveness and scene-generalizability of the proposed method, and 2) favorable performance compared to classical methods that require scene-specific optimization.