Deep Reinforcement Learning (DRL) enables cognitive Autonomous Ground Vehicle (AGV) navigation from raw sensor data without a priori maps or GPS, a necessity in hazardous, information-poor environments such as natural disaster zones and extraterrestrial planets. The substantial training time required to learn an optimal DRL policy, which can be days or weeks for complex tasks, is a major hurdle to real-world implementation in AGV applications. Training entails repeated collisions with the surrounding environment over an extended period, depending on task complexity, to reinforce positive, application-specific exploratory behavior; in the real world this is expensive and time-consuming. Effectively bridging the simulation-to-real-world gap is therefore a prerequisite for successfully implementing DRL in complex AGV applications, enabling cost-effective policy learning. We present AutoVRL, an open-source, high-fidelity simulator built upon the Bullet physics engine that uses OpenAI Gym and Stable Baselines3 in PyTorch to train AGV DRL agents for sim-to-real policy transfer. AutoVRL is equipped with sensor implementations of GPS, IMU, LiDAR, and camera, actuators for AGV control, and realistic environments, and it is extensible to new environments and AGV models. The simulator provides access to state-of-the-art DRL algorithms through a Python interface for simple algorithm and environment customization and simulation execution.
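To make the Gym-style interface mentioned above concrete, the following is a minimal sketch of the reset/step pattern that such a simulator exposes to a DRL algorithm. It is not AutoVRL's API: the class `ToyAGVEnv`, its parameters, the reward values, and the helper `run_episode` are all hypothetical, and the sketch uses only the Python standard library rather than Bullet, Gym, or Stable Baselines3. The robot is a point in an empty square room, the observation is a set of LiDAR-like wall ranges plus the distance to a goal, and an episode ends on collision or on reaching the goal. The five-value return of `step` is assumed to follow the convention of recent Gym releases.

```python
import math


class ToyAGVEnv:
    """Hypothetical Gym-style environment: a point robot in an empty square room.

    Illustrative only; this is not AutoVRL's API and uses no physics engine.
    """

    FORWARD, TURN_LEFT, TURN_RIGHT = 0, 1, 2

    def __init__(self, size=10.0, n_rays=4, goal=(8.0, 5.0),
                 goal_radius=0.5, step_length=1.0, max_steps=200):
        self.size = size
        self.n_rays = n_rays
        self.goal = goal
        self.goal_radius = goal_radius
        self.step_length = step_length
        self.max_steps = max_steps
        self.reset()

    def _goal_distance(self):
        return math.hypot(self.goal[0] - self.x, self.goal[1] - self.y)

    def _observation(self):
        """Wall ranges along n_rays evenly spaced rays, plus the goal distance."""
        ranges = []
        for k in range(self.n_rays):
            angle = self.heading + 2.0 * math.pi * k / self.n_rays
            dx, dy = math.cos(angle), math.sin(angle)
            hits = []
            if abs(dx) > 1e-9:  # ray is not parallel to the walls x=0, x=size
                hits.append(((self.size if dx > 0 else 0.0) - self.x) / dx)
            if abs(dy) > 1e-9:  # ray is not parallel to the walls y=0, y=size
                hits.append(((self.size if dy > 0 else 0.0) - self.y) / dy)
            ranges.append(min(hits))
        return ranges + [self._goal_distance()]

    def reset(self):
        """Place the robot at its start pose and return (observation, info)."""
        self.x, self.y, self.heading = 1.0, 5.0, 0.0
        self.steps = 0
        return self._observation(), {}

    def step(self, action):
        """Apply one action; return (obs, reward, terminated, truncated, info)."""
        previous = self._goal_distance()
        collided = False
        if action == self.FORWARD:
            nx = self.x + self.step_length * math.cos(self.heading)
            ny = self.y + self.step_length * math.sin(self.heading)
            if 0.0 < nx < self.size and 0.0 < ny < self.size:
                self.x, self.y = nx, ny
            else:
                collided = True  # robot stays in place and the episode ends
        elif action == self.TURN_LEFT:
            self.heading += math.pi / 2.0
        elif action == self.TURN_RIGHT:
            self.heading -= math.pi / 2.0
        else:
            raise ValueError(f"unknown action: {action!r}")
        self.steps += 1
        distance = self._goal_distance()
        reached = distance <= self.goal_radius
        if collided:
            reward = -1.0  # assumed collision penalty
        elif reached:
            reward = 1.0   # assumed goal bonus
        else:
            reward = previous - distance  # progress toward the goal
        terminated = collided or reached
        truncated = not terminated and self.steps >= self.max_steps
        return self._observation(), reward, terminated, truncated, {}


def run_episode(env, policy):
    """Roll out one episode; policy maps an observation to an action."""
    obs, _ = env.reset()
    total, steps = 0.0, 0
    while True:
        obs, reward, terminated, truncated, _ = env.step(policy(obs))
        total += reward
        steps += 1
        if terminated or truncated:
            return total, steps, terminated


if __name__ == "__main__":
    # A fixed "always drive forward" policy stands in for a trained agent.
    print(run_episode(ToyAGVEnv(), lambda obs: ToyAGVEnv.FORWARD))
```

In a real setup, an environment of this shape would typically subclass the Gym `Env` base class and declare its observation and action spaces so that a library such as Stable Baselines3 can train on it; the policy function here would then be replaced by the learned agent.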