3D object reconstruction and multilevel segmentation are fundamental to computer vision research. Existing algorithms usually perform 3D scene reconstruction and target object segmentation independently, and their performance cannot be fully guaranteed because 3D segmentation remains challenging. Here we propose an open-source, one-stop 3D target reconstruction and multilevel segmentation framework (OSTRA), which performs segmentation on 2D images, tracks multiple instances with segmentation labels through the image sequence, and then reconstructs labelled 3D objects or multiple parts with Multi-View Stereo (MVS) or RGBD-based 3D reconstruction methods. We extend object tracking and 3D reconstruction algorithms to support continuous segmentation labels, so that 3D object segmentation can leverage advances in 2D image segmentation, especially the Segment-Anything Model (SAM), which applies a pretrained neural network to new scenes without additional training. OSTRA supports the most popular 3D object models, including point clouds, meshes and voxels, and achieves high performance for semantic segmentation, instance segmentation and part segmentation on several 3D datasets. It even surpasses manual segmentation in scenes with complex structures and occlusions. Our method opens up a new avenue for reconstructing 3D targets embedded with rich multi-scale segmentation information in complex scenes. OSTRA is available from https://github.com/ganlab/OSTRA.
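The core idea of carrying 2D segmentation labels into 3D can be illustrated with a minimal sketch. The code below is not taken from the OSTRA repository; it assumes a simple pinhole model with known 3x4 projection matrices, ignores occlusion, and uses a hypothetical helper name, `vote_point_labels`. Each 3D point is projected into every labelled view and receives the label that wins a majority vote.

```python
import numpy as np


def vote_point_labels(points, projections, masks, num_labels):
    """Assign each 3D point the label seen most often at its 2D projections.

    Illustrative only (hypothetical helper, not the OSTRA implementation).

    points:      (N, 3) array of 3D points in world coordinates.
    projections: list of (3, 4) camera projection matrices, one per view.
    masks:       list of (H, W) integer label images; 0 means background.
    num_labels:  number of distinct labels, including background.
    """
    n = points.shape[0]
    homog = np.hstack([points, np.ones((n, 1))])  # homogeneous coords, (N, 4)
    votes = np.zeros((n, num_labels), dtype=np.int64)

    for P, mask in zip(projections, masks):
        h, w = mask.shape
        proj = homog @ P.T  # (N, 3): [u * depth, v * depth, depth]
        depth = proj[:, 2]
        in_front = depth > 1e-9
        # Avoid dividing by zero or negative depth; those points are masked out.
        safe_depth = np.where(in_front, depth, 1.0)
        u = np.floor(proj[:, 0] / safe_depth).astype(int)
        v = np.floor(proj[:, 1] / safe_depth).astype(int)
        visible = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)

        idx = np.nonzero(visible)[0]
        labels = mask[v[idx], u[idx]]
        # Unbuffered add so repeated (point, label) pairs are all counted.
        np.add.at(votes, (idx, labels), 1)

    # Points never seen in any view keep all-zero votes and fall back to 0.
    return votes.argmax(axis=1)
```

A real pipeline would also need a visibility test (for example, a depth comparison) so that occluded points do not inherit labels from the surfaces in front of them; that step is omitted here for brevity.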