We present iSeg, a new interactive technique for segmenting 3D shapes. Previous works have focused mainly on leveraging pre-trained 2D foundation models for text-based 3D segmentation. However, text may be insufficient for accurately describing fine-grained spatial segmentations. Moreover, achieving a consistent 3D segmentation using a 2D model is challenging, since occluded areas of the same semantic region may not be visible together from any single 2D view. Thus, we design a segmentation method conditioned on fine-grained user clicks, which operates entirely in 3D. Our system accepts user clicks directly on the shape's surface, indicating the inclusion or exclusion of regions from the desired shape partition. To accommodate various click settings, we propose a novel interactive attention module capable of processing different numbers and types of clicks, enabling the training of a single unified interactive segmentation model. We apply iSeg to a wide range of shapes from different domains, demonstrating its versatility and faithfulness to the user's specifications. Our project page is at https://threedle.github.io/iSeg/.
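To make the idea of attending over a variable number of positive and negative clicks concrete, the following is a minimal toy sketch in plain Python. It is not the iSeg architecture, whose details the abstract does not specify. The function `click_attention`, the `type_embed` table, and the choice of +1/-1 click values are all hypothetical, chosen only to illustrate how one attention step can accept any number and type of clicks.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def click_attention(vertex_feats, clicks, type_embed):
    """Toy click-conditioned attention (illustrative assumption, not iSeg).

    vertex_feats: list of d-dimensional feature vectors, one per vertex.
    clicks: non-empty list of (vertex_index, is_positive) pairs of any length.
    type_embed: dict mapping True/False to a d-dimensional type embedding.
    Returns one score in (0, 1) per vertex.
    """
    d = len(vertex_feats[0])
    # One key per click: feature of the clicked vertex plus its type embedding.
    keys = [
        [f + t for f, t in zip(vertex_feats[idx], type_embed[is_pos])]
        for idx, is_pos in clicks
    ]
    # One scalar value per click: +1 for inclusion, -1 for exclusion.
    values = [1.0 if is_pos else -1.0 for _, is_pos in clicks]

    scores = []
    for query in vertex_feats:
        # Scaled dot-product attention of this vertex over all clicks.
        weights = softmax([dot(query, k) / math.sqrt(d) for k in keys])
        logit = sum(w * v for w, v in zip(weights, values))
        scores.append(1.0 / (1.0 + math.exp(-logit)))  # sigmoid
    return scores


# Two vertices with well-separated features; one positive and one negative click.
feats = [[4.0, 0.0], [0.0, 4.0]]
no_type = {True: [0.0, 0.0], False: [0.0, 0.0]}
scores = click_attention(feats, [(0, True), (1, False)], no_type)
```

Because the softmax normalizes over however many clicks are supplied, the same function handles one click or many. In this toy setup, the vertex whose features resemble the positive click scores above 0.5 and the vertex resembling the negative click scores below 0.5.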