Multi-modality Affinity Inference for Weakly Supervised 3D Semantic Segmentation

3D point cloud semantic segmentation has a wide range of applications. Recently, weakly supervised point cloud segmentation methods have been proposed to alleviate the expensive and laborious manual annotation process by leveraging scene-level labels. However, these methods do not effectively exploit the rich geometric information (such as shape and scale) and appearance information (such as color and texture) present in RGB-D scans. Furthermore, current approaches fail to fully leverage the point affinity that can be inferred from the feature extraction network, which is crucial for learning from weak scene-level labels. Previous work also overlooks the detrimental effects of the long-tailed distribution of point cloud data in weakly supervised 3D semantic segmentation. To this end, this paper proposes a simple yet effective scene-level weakly supervised point cloud segmentation method with a newly introduced multi-modality point affinity inference module. The point affinity proposed in this paper is characterized by features from multiple modalities (e.g., point cloud and RGB), and is further refined by normalizing the classifier weights, which alleviates the detrimental effects of the long-tailed distribution without requiring a prior on the category distribution. Extensive experiments on the ScanNet and S3DIS benchmarks verify the effectiveness of the proposed method, which outperforms the state of the art by ~4% to ~6% mIoU. Code is released at https://github.com/Sunny599/AAAI24-3DWSSG-MMA.
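To make the two ideas in the abstract concrete (an affinity built from more than one modality, and classifier weights normalized so that no class dominates through weight magnitude alone), here is a minimal NumPy sketch. It is not the authors' implementation: the cosine-similarity affinity, the equal-weight fusion controlled by `alpha`, the one-step propagation, and every function name are assumptions chosen purely for illustration. The released repository is the reference for the actual method.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors along `axis` to (approximately) unit L2 norm."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def cosine_affinity(features):
    """Row-normalized, non-negative cosine similarity between points.

    Assumption: affinity is modeled as clipped cosine similarity.
    """
    f = l2_normalize(features)
    sim = np.clip(f @ f.T, 0.0, None)  # drop negative similarities
    return sim / (sim.sum(axis=1, keepdims=True) + 1e-8)


def fused_affinity(geo_feats, rgb_feats, alpha=0.5):
    """Blend geometric and RGB affinities (hypothetical fusion rule)."""
    return alpha * cosine_affinity(geo_feats) + (1.0 - alpha) * cosine_affinity(rgb_feats)


def normalized_logits(features, class_weights):
    """Class scores using unit-norm classifier weight vectors.

    Normalizing each class's weight row removes magnitude differences
    between classes, so no class-frequency prior is needed.
    """
    return features @ l2_normalize(class_weights, axis=1).T


def refine_scores(geo_feats, rgb_feats, class_weights, alpha=0.5):
    """Propagate per-point class scores once through the fused affinity."""
    scores = normalized_logits(geo_feats, class_weights)
    return fused_affinity(geo_feats, rgb_feats, alpha) @ scores
```

For N points with D-dimensional geometric features, 3-dimensional RGB features, and a C x D classifier weight matrix, `refine_scores` returns an N x C array of smoothed class scores. Rescaling any row of the weight matrix leaves the scores essentially unchanged, which is the property the normalization step is meant to provide.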


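Point cloud

A point cloud obtained by laser measurement contains 3D coordinates (XYZ) and laser reflection intensity (Intensity). A point cloud obtained by photogrammetry contains 3D coordinates (XYZ) and color information (RGB). A point cloud obtained by combining laser measurement and photogrammetry contains 3D coordinates (XYZ), laser reflection intensity (Intensity), and color information (RGB). Once the spatial coordinates of each sampled point on an object's surface have been acquired, the result is a set of points, which is called a "point cloud".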
