OccRWKV: Rethinking Efficient 3D Semantic Occupancy Prediction with Linear Complexity

3D semantic occupancy prediction networks have demonstrated remarkable capabilities in reconstructing the geometric and semantic structure of 3D scenes, providing crucial information for robot navigation and autonomous driving systems. However, because their dense network designs incur a large computational overhead, existing networks struggle to balance accuracy and latency. In this paper, we introduce OccRWKV, an efficient semantic occupancy network inspired by Receptance Weighted Key Value (RWKV). OccRWKV separates semantics, occupancy prediction, and feature fusion into distinct branches, each incorporating Sem-RWKV and Geo-RWKV blocks. These blocks are designed to capture long-range dependencies, enabling the network to learn domain-specific representations (i.e., semantics and geometry), which improves prediction accuracy. Leveraging the sparse nature of real-world 3D occupancy, we reduce computational overhead by projecting features into the bird's-eye view (BEV) space and propose a BEV-RWKV block for efficient feature enhancement and fusion. This enables real-time inference at 22.2 FPS without compromising performance. Experiments demonstrate that OccRWKV outperforms state-of-the-art methods on the SemanticKITTI dataset, achieving an mIoU of 25.1 while being 20 times faster than the best baseline, Co-Occ, making it suitable for real-time deployment on robots to enhance autonomous navigation efficiency. Code and video are available on our project page: https://jmwang0117.github.io/OccRWKV/.
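To make the BEV projection step more concrete, here is a minimal sketch of one common way to collapse a 3D voxel feature grid into a 2D bird's-eye-view map: folding the height axis into the channel axis. This is an illustrative assumption, not the authors' implementation; the abstract does not say how OccRWKV performs the projection. The function name `project_to_bev` and the grid size are hypothetical.

```python
import numpy as np


def project_to_bev(voxel_features: np.ndarray) -> np.ndarray:
    """Fold the height axis of a (C, X, Y, Z) voxel grid into the channels.

    Returns an array of shape (C * Z, X, Y). This is one possible projection,
    assumed for illustration; the paper may use a different one.
    """
    c, x, y, z = voxel_features.shape
    # Reorder to (C, Z, X, Y) so each height slice sits next to its channel,
    # then merge the two leading axes into a single channel axis.
    return voxel_features.transpose(0, 3, 1, 2).reshape(c * z, x, y)


# Hypothetical small grid: 4 channels, 16 x 16 ground-plane cells, 8 height bins.
rng = np.random.default_rng(0)
features = rng.standard_normal((4, 16, 16, 8)).astype(np.float32)
bev = project_to_bev(features)

# A sequence model over the 3D grid sees X * Y * Z positions,
# while one over the BEV map sees only X * Y positions.
tokens_3d = features.shape[1] * features.shape[2] * features.shape[3]
tokens_bev = bev.shape[1] * bev.shape[2]
print(bev.shape, tokens_3d, tokens_bev)
```

In this sketch the number of spatial positions drops by a factor equal to the number of height bins (8 here), which is the kind of saving that makes a 2D fusion block cheaper than a dense 3D one.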

