Segment Any Point Cloud Sequences by Distilling Vision Foundation Models

Recent advancements in vision foundation models (VFMs) have opened up new possibilities for versatile and efficient visual perception. In this work, we introduce Seal, a novel framework that harnesses VFMs for segmenting diverse automotive point cloud sequences. Seal exhibits three appealing properties: i) Scalability: VFMs are directly distilled into point clouds, eliminating the need for annotations in either 2D or 3D during pretraining. ii) Consistency: Spatial and temporal relationships are enforced at both the camera-to-LiDAR and point-to-segment stages, facilitating cross-modal representation learning. iii) Generalizability: Seal enables knowledge transfer in an off-the-shelf manner to downstream tasks involving diverse point clouds, including those from real/synthetic, low/high-resolution, large/small-scale, and clean/corrupted datasets. Extensive experiments conducted on eleven different point cloud datasets showcase the effectiveness and superiority of Seal. Notably, Seal achieves a remarkable 45.0% mIoU on nuScenes after linear probing, surpassing random initialization by 36.9% mIoU and outperforming prior art by 6.1% mIoU. Moreover, Seal demonstrates significant performance gains over existing methods across 20 different few-shot fine-tuning tasks on all eleven tested point cloud datasets.
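The results above are reported in mIoU (mean intersection-over-union). As a minimal sketch of how that metric is commonly computed for per-point semantic labels, the function below is an illustration only: the name `mean_iou`, its parameters, and the `ignore_index` convention are assumptions, not the evaluation code used for Seal or by any benchmark, and real benchmarks accumulate counts over a whole validation set with their own class lists.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int, ignore_index: int = -1) -> float:
    """Mean IoU over the classes that occur in pred or target (hypothetical helper)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled points do not count toward any class
        if p == t:
            # A correct point lies in both the intersection and the union.
            intersection[t] += 1
            union[t] += 1
        else:
            # A wrong point enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Five points, three classes: per-class IoU is 1/2, 2/3, and 1/1.
score = mean_iou([0, 0, 1, 1, 2], [0, 1, 1, 1, 2], num_classes=3)
```

Here `score` is the average of the three per-class IoU values; classes that never appear in either the prediction or the ground truth are skipped rather than counted as zero, which is one of several conventions in use.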

Point cloud

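A point cloud obtained by laser measurement contains three-dimensional coordinates (XYZ) and laser reflection intensity (Intensity). A point cloud obtained by photogrammetry contains three-dimensional coordinates (XYZ) and color information (RGB). A point cloud obtained by combining laser measurement and photogrammetry contains three-dimensional coordinates (XYZ), laser reflection intensity (Intensity), and color information (RGB). Once the spatial coordinates of each sampled point on an object's surface have been acquired, the result is a set of points, which is called a "point cloud".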
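The three variants described above can be sketched as a single record type with optional attributes. This is an illustrative assumption about how such data might be laid out, not a format defined by the source: the class `Point`, the function `acquisition_kind`, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Point:
    """One sampled surface point; field names are illustrative assumptions."""
    xyz: Tuple[float, float, float]
    intensity: Optional[float] = None          # present for laser measurement
    rgb: Optional[Tuple[int, int, int]] = None  # present for photogrammetry


def acquisition_kind(point: Point) -> str:
    """Classify a point by which optional attributes it carries."""
    has_intensity = point.intensity is not None
    has_rgb = point.rgb is not None
    if has_intensity and has_rgb:
        return "laser + photogrammetry"
    if has_intensity:
        return "laser"
    if has_rgb:
        return "photogrammetry"
    return "geometry only"


# A point cloud is simply a collection of such points (made-up sample values).
cloud = [
    Point(xyz=(1.0, 2.0, 0.5), intensity=0.8),
    Point(xyz=(1.1, 2.0, 0.5), rgb=(120, 130, 140)),
    Point(xyz=(1.2, 2.1, 0.6), intensity=0.3, rgb=(10, 20, 30)),
]
kinds = [acquisition_kind(p) for p in cloud]
```

In practice, large point clouds are usually stored as dense arrays with one row per point rather than as per-point objects; the record form is used here only to make the attribute combinations explicit.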
