In recent years, computer vision tasks like target tracking and human pose estimation have benefited immensely from synthetic data generation and novel rendering techniques. Methods in robotics, on the other hand, especially those for robot perception, have been slow to leverage these techniques. This is because state-of-the-art simulation frameworks for robotics lack one or more of the following: complete control, integration with the Robot Operating System (ROS), realistic physics, or photorealism. To solve this, we present a fully customizable framework for generating realistic animated dynamic environments (GRADE) for robotics research, focused primarily on robot perception. The framework can be used either to generate ground truth data for robotic vision-related tasks and offline processing, or to experiment with robots online in dynamic environments. We build upon NVIDIA Isaac Sim to allow control of custom robots. We provide methods to include assets, populate and control the simulation, and process the data. Using autonomous robots in GRADE, we generate video datasets of an indoor dynamic environment. First, we use this data to demonstrate the framework's visual realism by evaluating the sim-to-real gap through experiments with YOLO and Mask R-CNN. Second, we benchmark dynamic SLAM algorithms on it. These experiments not only show that GRADE can significantly improve training performance and generalization to real sequences, but also highlight how current dynamic SLAM methods over-rely on known benchmarks and fail to generalize. We also introduce a method to precisely repeat a previously recorded experiment while allowing changes in the surroundings of the robot. Code and data are provided as open source at https://grade.is.tue.mpg.de.