SeMLaPS: Real-time Semantic Mapping with Latent Prior Networks and Quasi-Planar Segmentation

The availability of real-time semantics greatly improves the core geometric functionality of SLAM systems, enabling numerous robotic and AR/VR applications. We present a new methodology for real-time semantic mapping from RGB-D sequences that combines a 2D neural network and a 3D network based on a SLAM system with 3D occupancy mapping. When segmenting a new frame, we perform latent feature re-projection from previous frames based on differentiable rendering. Fusing re-projected feature maps from previous frames with current-frame features greatly improves image segmentation quality compared to a baseline that processes images independently. For 3D map processing, we propose a novel geometric quasi-planar over-segmentation method that groups 3D map elements likely to belong to the same semantic class, relying on surface normals. We also describe a novel neural network design for lightweight semantic map post-processing. Our system achieves state-of-the-art semantic mapping quality among 2D-3D network-based systems and matches the performance of 3D convolutional networks on three real indoor datasets, while running in real time. Moreover, it shows better cross-sensor generalization than 3D CNNs, enabling training and inference with different depth sensors. Code and data will be released on the project page: http://jingwenwang95.github.io/SeMLaPS
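
To make the idea of grouping map elements by surface normals more concrete, the following is a minimal illustrative sketch, not the method from the paper. It assumes each map element has a normal vector and a list of neighbour pairs, and it merges neighbours whose normals differ by less than an angle threshold using union-find. The function name `group_by_normals`, the `max_angle_deg` parameter, and the 10-degree default are hypothetical choices made only for this example.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def group_by_normals(normals, edges, max_angle_deg=10.0):
    """Label elements so that neighbours with similar normals share a label.

    normals: list of 3D normal vectors, one per map element (assumed input).
    edges: list of (i, j) index pairs marking spatially adjacent elements.
    max_angle_deg: hypothetical threshold on the angle between normals.
    """
    parent = list(range(len(normals)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    cos_thresh = math.cos(math.radians(max_angle_deg))
    unit = [normalize(n) for n in normals]
    for a, b in edges:
        dot = sum(x * y for x, y in zip(unit[a], unit[b]))
        if dot >= cos_thresh:  # angle between normals is below the threshold
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[rb] = ra
    return [find(i) for i in range(len(normals))]


# Toy example: two floor-like elements next to two wall-like elements.
normals = [(0, 0, 1), (0, 0.05, 1), (1, 0, 0), (1, 0.02, 0)]
edges = [(0, 1), (1, 2), (2, 3)]
labels = group_by_normals(normals, edges)
```

Because merging is pairwise between neighbours, a gently curved surface can end up in a single segment even when its end-to-end normal change exceeds the threshold, which is one way to read "quasi-planar"; the actual criterion used by the authors may differ.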

