3D instance segmentation is crucial for understanding point cloud scenes. This paper presents a novel neural network architecture for instance segmentation on 3D point clouds. We propose to learn coefficients and prototypes jointly and in parallel; the two are then combined to obtain the instance predictions. The coefficients are computed from an overcomplete set of sampled points using a novel multi-scale module, dubbed dilated point inception. Because the resulting set of instance mask predictions is overcomplete, we employ a non-maximum suppression algorithm to retrieve the final predictions. This approach makes it possible to omit the time-expensive clustering step and leads to a more stable inference time. The proposed method is not only 28% faster than the state of the art but also exhibits the lowest standard deviation of inference time. Our experiments show that the standard deviation of the inference time is only 1.0% of the total time, whereas it ranges between 10.8% and 53.1% for state-of-the-art methods. Lastly, our method outperforms the state of the art on both S3DIS-blocks (by 4.9% in mRec on Fold-5) and PartNet (by 2.0% on average in mAP).
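The abstract describes the pipeline only at a high level, so the following is a minimal sketch of the two steps it names: combining coefficients with prototypes to form candidate masks, and pruning the overcomplete candidate set with non-maximum suppression. It assumes that the combination is a linear product followed by a sigmoid and that overlap is measured as point-wise IoU between binarized masks; neither detail is stated in the abstract. The function names `assemble_masks` and `mask_nms`, the array shapes, and the thresholds are all hypothetical and chosen for illustration.

```python
import numpy as np


def assemble_masks(prototypes, coefficients):
    """Combine prototypes and coefficients into soft instance masks.

    prototypes:   (N, K) array, K prototype values for each of N points.
    coefficients: (M, K) array, K coefficients for each of M candidates.
    Returns an (M, N) array of per-point mask probabilities.

    Assumption: linear combination followed by a sigmoid.
    """
    logits = coefficients @ prototypes.T
    return 1.0 / (1.0 + np.exp(-logits))


def mask_nms(masks, scores, iou_threshold=0.5, mask_threshold=0.5):
    """Greedy non-maximum suppression over candidate point masks.

    masks:  (M, N) soft masks, scores: (M,) confidence per candidate.
    Returns the indices of the kept candidates, highest score first.

    Assumption: overlap is the IoU of the binarized masks.
    """
    binary = masks > mask_threshold
    order = np.argsort(-scores)  # candidate indices by descending score
    keep = []
    for idx in order:
        candidate = binary[idx]
        duplicate = False
        for kept in keep:
            inter = np.logical_and(candidate, binary[kept]).sum()
            union = np.logical_or(candidate, binary[kept]).sum()
            if union > 0 and inter / union > iou_threshold:
                duplicate = True
                break
        if not duplicate:
            keep.append(int(idx))
    return keep


# Toy scene: six points, two prototypes, three candidates.
# Candidates 0 and 1 describe the same instance; candidate 2 is distinct.
prototypes = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
coefficients = np.array([[10, -10], [9, -9], [-10, 10]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])

masks = assemble_masks(prototypes, coefficients)
kept = mask_nms(masks, scores)
```

In this toy case the lower-scoring duplicate candidate is suppressed and one mask per instance remains. Because the pruning cost depends on the number of sampled candidates rather than on the scene's contents, a step of this kind is consistent with the stable inference time the abstract reports, although the actual implementation may differ.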