CoSSegGaussians: Compact and Swift Scene Segmenting 3D Gaussians

We propose Compact and Swift Segmenting 3D Gaussians (CoSSegGaussians), a method for compact, 3D-consistent scene segmentation at fast rendering speed using only RGB images as input. Previous NeRF-based 3D segmentation methods rely on implicit or voxel neural scene representations and ray-marching volume rendering, which are time-consuming. Recent 3D Gaussian Splatting significantly improves rendering speed; however, existing Gaussian-based segmentation methods (e.g., Gaussian Grouping) fail to provide compact segmentation masks, especially in zero-shot segmentation. This is mainly because directly assigning learnable parameters to each Gaussian lacks robustness and compactness when the 2D machine-generated labels are inconsistent. Our method aims to achieve compact and reliable zero-shot scene segmentation swiftly by mapping fused spatial and semantically meaningful features of each Gaussian point through a shallow decoding network. Specifically, our method first optimizes the position, covariance, and color attributes of the Gaussian points under the supervision of RGB images. After this Gaussian Locating stage, we distill multi-scale DINO features extracted from the images onto each Gaussian through unprojection, and combine them with spatial features from a fast point-feature processing network, i.e., RandLA-Net. A shallow decoding MLP is then applied to the multi-scale fused features to obtain compact segmentation. Experimental results show that our model performs high-quality zero-shot scene segmentation: it outperforms other segmentation methods on both semantic and panoptic segmentation tasks, while requiring only about 10% of the segmentation time of NeRF-based methods. Code and more results will be available at https://David-Dou.github.io/CoSSegGaussians
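The feature pipeline described above (unproject 2D features onto each Gaussian, fuse them with spatial features, decode with a shallow MLP) can be illustrated with a minimal sketch. This is not the authors' implementation. All function names, the translation-only camera model, nearest-pixel sampling, averaging across views, and the two-layer ReLU decoder are assumptions made for illustration, and the DINO and RandLA-Net features are stood in for by plain lists of numbers.

```python
def project(point, focal, cx, cy):
    """Pinhole projection of a camera-space point to pixel coordinates (u, v)."""
    x, y, z = point
    if z <= 0:  # point lies behind the camera
        return None
    return (focal * x / z + cx, focal * y / z + cy)


def unproject_features(centers, views, feat_dim):
    """Give each Gaussian the mean of the 2D features found at its projections.

    centers: list of (x, y, z) Gaussian centers in world space.
    views:   list of dicts with keys 'feat' (H x W x C nested lists),
             'offset' (translation-only extrinsics, a simplification),
             'focal', 'cx', 'cy'.
    """
    per_gaussian = []
    for p in centers:
        acc = [0.0] * feat_dim
        hits = 0
        for v in views:
            cam = tuple(pc + tc for pc, tc in zip(p, v["offset"]))
            uv = project(cam, v["focal"], v["cx"], v["cy"])
            if uv is None:
                continue
            col, row = int(round(uv[0])), int(round(uv[1]))
            feat = v["feat"]
            # Keep only projections that land inside the feature map.
            if 0 <= row < len(feat) and 0 <= col < len(feat[0]):
                acc = [a + f for a, f in zip(acc, feat[row][col])]
                hits += 1
        # Gaussians seen by no view keep an all-zero feature.
        per_gaussian.append([a / hits for a in acc] if hits else acc)
    return per_gaussian


def shallow_mlp(x, w1, b1, w2, b2):
    """Two-layer decoder: linear -> ReLU -> linear, returning class logits."""
    hidden = [max(0.0, sum(xi * w for xi, w in zip(x, row)) + b)
              for row, b in zip(w1, b1)]
    return [sum(h * w for h, w in zip(hidden, row)) + b
            for row, b in zip(w2, b2)]


def segment(centers, views, spatial_feats, params, feat_dim):
    """Concatenate semantic and spatial features per Gaussian, then decode."""
    semantic = unproject_features(centers, views, feat_dim)
    labels = []
    for sem, spa in zip(semantic, spatial_feats):
        logits = shallow_mlp(sem + spa, *params)
        labels.append(max(range(len(logits)), key=logits.__getitem__))
    return labels
```

In this sketch the class label is a function of the fused feature through shared decoder weights, rather than a free parameter stored on each Gaussian, which is the contrast the abstract draws with directly assigning learnable parameters to every Gaussian.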

