Embodied agents require robust navigation systems to operate in unstructured environments, making the robustness of Simultaneous Localization and Mapping (SLAM) models critical to their autonomy. While real-world datasets are invaluable, simulation-based benchmarks offer a scalable approach to robustness evaluation. However, the creation of challenging and controllable noisy worlds with diverse perturbations remains under-explored. To this end, we propose a novel, customizable pipeline for noisy data synthesis, aimed at assessing the resilience of multi-modal SLAM models against various perturbations. The pipeline comprises a comprehensive taxonomy of sensor and motion perturbations for embodied multi-modal (specifically RGB-D) sensing, categorized by their sources and propagation order, which allows them to be composed procedurally. We also provide a toolbox for synthesizing these perturbations, enabling the transformation of clean environments into challenging noisy simulations. Using the pipeline, we instantiate the large-scale Noisy-Replica benchmark, which includes diverse perturbation types, to evaluate the risk tolerance of existing advanced RGB-D SLAM models. Our extensive analysis uncovers the susceptibility of both neural (NeRF- and Gaussian Splatting-based) and non-neural SLAM models to disturbances, despite their demonstrated accuracy on standard benchmarks. Our code is publicly available at https://github.com/Xiaohao-Xu/SLAM-under-Perturbation.
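To make the idea of procedurally composing perturbations concrete, the following is a minimal sketch, not the API of the released toolbox. It assumes an RGB-D frame can be reduced to flattened intensity and depth lists, and every name in it (`Frame`, `rgb_gaussian_noise`, `depth_dropout`, `compose`, `perturb_sequence`) is hypothetical. Two illustrative sensor perturbations are chained in a fixed order to turn a clean sequence into a noisy one; a motion perturbation could be added in the same way as another function on the frame.

```python
import random
from dataclasses import dataclass, replace
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Frame:
    """Hypothetical, simplified RGB-D frame (flattened pixel lists)."""
    rgb: List[float]    # intensities in [0, 1]
    depth: List[float]  # depth in metres; 0.0 marks a missing reading


# A perturbation maps a frame to a perturbed copy, drawing from a shared RNG.
Perturbation = Callable[[Frame, random.Random], Frame]


def rgb_gaussian_noise(sigma: float) -> Perturbation:
    """Additive Gaussian noise on the colour channel, clipped to [0, 1]."""
    def apply(frame: Frame, rng: random.Random) -> Frame:
        noisy = [min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in frame.rgb]
        return replace(frame, rgb=noisy)
    return apply


def depth_dropout(p: float) -> Perturbation:
    """Drop each depth reading independently with probability p."""
    def apply(frame: Frame, rng: random.Random) -> Frame:
        depth = [0.0 if rng.random() < p else d for d in frame.depth]
        return replace(frame, depth=depth)
    return apply


def compose(*perturbations: Perturbation) -> Perturbation:
    """Chain perturbations; they are applied in the order given."""
    def apply(frame: Frame, rng: random.Random) -> Frame:
        for perturb in perturbations:
            frame = perturb(frame, rng)
        return frame
    return apply


def perturb_sequence(frames: Sequence[Frame], perturbation: Perturbation,
                     seed: int = 0) -> List[Frame]:
    """Apply one composed perturbation to every frame, reproducibly."""
    rng = random.Random(seed)
    return [perturbation(frame, rng) for frame in frames]


# Usage: turn a clean three-frame sequence into a noisy one.
clean = [Frame(rgb=[0.5] * 12, depth=[1.0] * 4) for _ in range(3)]
pipeline = compose(rgb_gaussian_noise(0.1), depth_dropout(0.5))
noisy = perturb_sequence(clean, pipeline, seed=1)
```

Representing each perturbation as a function with a common signature keeps the composition order explicit, and the fixed seed makes a synthesized noisy sequence reproducible across runs.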