CoSSegGaussians: Compact and Swift Scene Segmenting 3D Gaussians with Dual Feature Fusion

We propose Compact and Swift Segmenting 3D Gaussians (CoSSegGaussians), a method for compact, 3D-consistent scene segmentation at fast rendering speed that takes only RGB images as input. Previous NeRF-based segmentation methods rely on time-consuming neural scene optimization. Although recent 3D Gaussian Splatting has notably improved speed, existing Gaussian-based segmentation methods struggle to produce compact masks, especially in zero-shot segmentation. This issue probably stems from their straightforward assignment of learnable parameters to each Gaussian, which leaves them without robustness against 2D machine-generated labels that are inconsistent across views. Our method addresses this problem by employing a Dual Feature Fusion Network as the Gaussians' segmentation field. Specifically, we first optimize the 3D Gaussians under RGB supervision. After Gaussian Locating, DINO features extracted from the images are applied to the Gaussians through explicit unprojection and are then combined with spatial features from an efficient point cloud processing network. Feature aggregation fuses the two in a global-to-local strategy to obtain compact segmentation features. Experimental results show that our model outperforms baselines on both semantic and panoptic zero-shot segmentation tasks, while requiring less than 10% of the inference time of NeRF-based methods. Code and more results will be available at https://David-Dou.github.io/CoSSegGaussians
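As a rough illustration of what "explicit unprojection" of image features onto Gaussians could look like, the sketch below projects each Gaussian center into every view with a pinhole camera model, reads the feature at the nearest pixel, and averages over the views in which the center lands inside the image. This is not the authors' implementation: the function name `unproject_features`, the dictionary layout of a view, nearest-pixel lookup, plain averaging, and the absence of any occlusion test are all assumptions made for the example, and the per-pixel features stand in for DINO features.

```python
from typing import Dict, List, Sequence


def unproject_features(
    centers: Sequence[Sequence[float]],
    views: Sequence[Dict],
    feat_dim: int,
) -> List[List[float]]:
    """Average per-pixel features over all views that see each 3D center.

    Hypothetical view layout (an assumption for this sketch):
      "R": 3x3 world-to-camera rotation, "t": 3-vector translation,
      "fx", "fy", "cx", "cy": pinhole intrinsics,
      "features": H x W x feat_dim nested list of per-pixel features.
    Occlusion is ignored: a center counts as visible whenever it
    projects inside the image and lies in front of the camera.
    """
    fused = []
    for center in centers:
        total = [0.0] * feat_dim
        hits = 0
        for view in views:
            R, t = view["R"], view["t"]
            # World -> camera coordinates: p = R @ center + t
            p = [sum(R[i][j] * center[j] for j in range(3)) + t[i] for i in range(3)]
            if p[2] <= 0:
                continue  # behind the camera
            # Pinhole projection, rounded to the nearest pixel
            col = int(round(view["fx"] * p[0] / p[2] + view["cx"]))
            row = int(round(view["fy"] * p[1] / p[2] + view["cy"]))
            fmap = view["features"]
            if 0 <= row < len(fmap) and 0 <= col < len(fmap[0]):
                pixel = fmap[row][col]
                for k in range(feat_dim):
                    total[k] += pixel[k]
                hits += 1
        # Centers seen by no view keep an all-zero feature
        fused.append([v / hits for v in total] if hits else total)
    return fused


# Toy usage: one 2x2 feature map with 2 channels, identity camera pose
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
toy_view = {
    "R": identity, "t": [0.0, 0.0, 0.0],
    "fx": 1.0, "fy": 1.0, "cx": 0.0, "cy": 0.0,
    "features": [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]],
}
per_gaussian = unproject_features([(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)], [toy_view], 2)
```

In a full pipeline the resulting per-Gaussian features would then be fused with spatial features from a point cloud network, as the abstract describes; that step is not sketched here.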

