Simultaneous Localization and Mapping (SLAM) plays an important role in many robotics fields, including social robots. Most available visual SLAM methods assume a static world and struggle in dynamic environments. In this study, we introduce a real-time semantic RGB-D SLAM approach designed specifically for dynamic environments. The proposed system effectively detects moving objects and maintains a static map to ensure robust camera tracking. The key innovation of our approach is the incorporation of deep learning-based semantic information into the SLAM pipeline to mitigate the impact of dynamic objects. In addition, we enhance the semantic segmentation process by integrating an Extended Kalman filter to identify dynamic objects that may be temporarily idle. We also employ a generative network to inpaint the regions of the input images occluded by dynamic objects. This highly modular framework has been implemented on the ROS platform and achieves around 22 fps on a GTX 1080 GPU. Benchmarking the pipeline on dynamic sequences from the TUM dataset shows that the proposed approach delivers localization error competitive with state-of-the-art methods while operating in near real-time. The source code is publicly available.
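To illustrate the two ideas the abstract combines, the sketch below shows (a) masking out pixels whose semantic class is dynamic before they reach the tracker, and (b) a constant-velocity Kalman filter over an object's image-plane centroid, whose velocity estimate keeps a briefly paused object flagged as dynamic. This is a minimal illustration, not the paper's implementation: the class names, the 2-D centroid state, and the speed threshold are all assumptions introduced here for clarity.

```python
import numpy as np

class ObjectEKF:
    """Constant-velocity Kalman filter over an object's 2-D centroid.
    The persistent velocity estimate lets a temporarily idle object
    stay flagged as dynamic (hypothetical thresholds, not from the paper)."""

    def __init__(self, centroid, dt=1.0):
        # State: [px, py, vx, vy]; velocity in pixels per frame.
        self.x = np.array([centroid[0], centroid[1], 0.0, 0.0])
        self.P = np.eye(4) * 10.0                  # initial uncertainty
        self.F = np.eye(4)                         # constant-velocity model
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.zeros((2, 4))                  # we observe position only
        self.H[0, 0] = self.H[1, 1] = 1.0
        self.Q = np.eye(4) * 0.01                  # process noise
        self.R = np.eye(2) * 1.0                   # measurement noise

    def step(self, centroid):
        # Predict
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Update with the observed centroid
        z = np.asarray(centroid, dtype=float)
        y = z - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P

    def is_dynamic(self, speed_thresh=5.0):
        # Classify on estimated speed, which decays slowly when the
        # object pauses, rather than on instantaneous displacement.
        return float(np.hypot(self.x[2], self.x[3])) > speed_thresh


def mask_dynamic_pixels(rgb, seg, dynamic_labels):
    """Zero out pixels whose semantic label belongs to a dynamic class,
    so camera tracking only uses the static parts of the scene."""
    mask = np.isin(seg, list(dynamic_labels))
    out = rgb.copy()
    out[mask] = 0
    return out
```

In a full system the zeroed regions would instead be handed to the generative inpainting network mentioned above, and one filter instance would be maintained per segmented object across frames.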