Simulation of Dynamic Environments for SLAM

Simulation engines are widely adopted in robotics. However, each of them lacks at least one of the following: full simulation control, ROS integration, realistic physics, or photorealism. Recently, synthetic data generation and realistic rendering have advanced tasks like target tracking and human pose estimation. However, data generated for vision applications usually lacks information such as sensor measurements or time continuity. On the other hand, simulations for most robotics tasks are performed in (semi)static environments, with specific sensors and low visual fidelity. To address this, our previous work introduced GRADE [1], a fully customizable framework for generating realistic animated dynamic environments. Here, we use GRADE to generate a dataset of indoor dynamic environments and then compare multiple SLAM algorithms on different sequences. In doing so, we show that current research over-relies on known benchmarks and fails to generalize. Our tests with refined YOLO and Mask R-CNN models provide further evidence that additional research in dynamic SLAM is necessary. The code, results, and generated data are provided as open-source at https://eliabntt.github.io/grade-rr