We introduce PartSTAD, a method for task adaptation in 2D-to-3D segmentation lifting. Recent studies have shown that 2D segmentation models can be lifted to high-quality 3D segmentation through few-shot adaptation. However, previous approaches adapt the 2D segmentation model to the domain shift toward rendered images and synthetic text descriptions, rather than optimizing it specifically for 3D segmentation. Our task adaptation method fine-tunes a 2D bounding box prediction model with an objective function defined on 3D segmentation. We assign a weight to each 2D bounding box so that boxes can be merged adaptively, and learn these weights with a small additional neural network. We also incorporate SAM, which segments the foreground within a given bounding box, to improve the boundaries of the 2D segments and consequently those of the 3D segmentation. Experiments on the PartNet-Mobility dataset show significant improvements from our task adaptation approach: a 7.0 percentage point increase in mIoU for semantic segmentation and a 5.2 percentage point increase in mAP_50 for instance segmentation over the state-of-the-art few-shot 3D segmentation model.
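To make the idea of weighted merging concrete, the following is a minimal sketch and not the PartSTAD implementation. It assumes each 2D prediction has already been lifted to a part label, a scalar weight, and the indices of the 3D points it covers, and it merges them by weighted voting per point. The function name `merge_weighted_masks`, the data layout, and the fixed weight values are hypothetical; in the method described above the weights are learned by a small neural network rather than set by hand.

```python
def merge_weighted_masks(num_points, masks, num_labels):
    """Assign a part label to each 3D point by weighted voting.

    masks: list of (label, weight, point_indices) tuples, one per lifted
    2D prediction (hypothetical layout, assumed for illustration).
    Returns a list of labels, with -1 for points no prediction covers.
    """
    # scores[p][k] accumulates the total weight voting for label k at point p.
    scores = [[0.0] * num_labels for _ in range(num_points)]
    for label, weight, point_indices in masks:
        for p in point_indices:
            scores[p][label] += weight

    labels = []
    for row in scores:
        best = max(range(num_labels), key=lambda k: row[k])
        labels.append(best if row[best] > 0.0 else -1)
    return labels


# Toy example: one confident prediction for part 0 and two weak ones for part 1.
masks = [
    (0, 0.9, [0, 1, 2]),
    (1, 0.3, [2, 3]),
    (1, 0.4, [2, 3]),
]
weighted = merge_weighted_masks(5, masks, num_labels=2)

# Same predictions with uniform weights, i.e. plain majority voting.
uniform_masks = [(label, 1.0, idx) for label, _, idx in masks]
uniform = merge_weighted_masks(5, uniform_masks, num_labels=2)

print(weighted)  # [0, 0, 0, 1, -1]
print(uniform)   # [0, 0, 1, 1, -1]
```

In this toy case point 2 is covered by all three predictions. Uniform voting gives it to part 1 (two votes against one), while the weighted vote keeps it in part 0 because the single confident prediction outweighs the two weak ones. This is the kind of decision that per-box weights can change.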