Comprehensive Training and Evaluation on Deep Reinforcement Learning for Automated Driving in Various Simulated Driving Maneuvers

Developing and testing automated driving models in the real world can be challenging and even dangerous, whereas simulation offers a safer alternative, especially for difficult maneuvers. Deep reinforcement learning (DRL) can tackle complex decision-making and control tasks by learning through interaction with the environment, which makes it well suited to automated driving, yet it has not been explored in detail for this purpose. This study implements, evaluates, and compares two DRL algorithms, Deep Q-Networks (DQN) and Trust Region Policy Optimization (TRPO), for training automated driving on the highway-env simulation platform. Effective, customized reward functions were developed, and the implemented algorithms were evaluated in terms of on-lane accuracy (how well the car stays on the road and within its lane), efficiency (how fast the car drives), safety (how likely the car is to crash into obstacles), and comfort (how much the car jerks, e.g., accelerates or brakes suddenly). Results show that the TRPO-based models with modified reward functions delivered the best performance in most cases. Furthermore, to train a uniform driving model that can handle various driving maneuvers rather than only specific ones, this study extended highway-env with an additional customized training environment, ComplexRoads, which integrates various driving maneuvers and multiple road scenarios. Models trained on ComplexRoads adapt well to other driving maneuvers with promising overall performance. Lastly, several functionalities were added to highway-env to implement this work. The code is available on GitHub at https://github.com/alaineman/drlcarsim-paper.
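The four evaluation criteria named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact metric definitions, so everything below is an assumption made for illustration: the `Step` record, the `evaluate_episode` function, the lane-width threshold, and the use of mean absolute jerk as a comfort proxy are hypothetical and are not taken from the paper or from highway-env.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class Step:
    """One simulation step (hypothetical record, not a highway-env type)."""
    speed: float           # longitudinal speed in m/s
    lateral_offset: float  # signed distance from the lane center in m
    on_road: bool          # whether the vehicle is on the drivable surface
    crashed: bool          # whether a collision occurred at this step


def evaluate_episode(steps: Sequence[Step], dt: float = 1.0,
                     lane_width: float = 4.0) -> Dict[str, float]:
    """Summarize one episode along the four criteria from the abstract.

    All formulas here are illustrative assumptions.
    """
    if not steps:
        raise ValueError("episode must contain at least one step")
    n = len(steps)

    # On-lane accuracy: share of steps spent on the road and inside the lane.
    on_lane = sum(
        1 for s in steps
        if s.on_road and abs(s.lateral_offset) <= lane_width / 2
    ) / n

    # Efficiency: mean speed over the episode.
    mean_speed = sum(s.speed for s in steps) / n

    # Safety: 1.0 if any step recorded a crash, else 0.0.
    crashed = 1.0 if any(s.crashed for s in steps) else 0.0

    # Comfort: mean absolute jerk, taken as the second finite
    # difference of speed (lower is more comfortable).
    accels = [(b.speed - a.speed) / dt for a, b in zip(steps, steps[1:])]
    jerks = [(b - a) / dt for a, b in zip(accels, accels[1:])]
    mean_abs_jerk = sum(abs(j) for j in jerks) / len(jerks) if jerks else 0.0

    return {
        "on_lane_accuracy": on_lane,
        "mean_speed": mean_speed,
        "crashed": crashed,
        "mean_abs_jerk": mean_abs_jerk,
    }


episode = [
    Step(speed=20.0, lateral_offset=0.0, on_road=True, crashed=False),
    Step(speed=22.0, lateral_offset=0.5, on_road=True, crashed=False),
    Step(speed=26.0, lateral_offset=3.0, on_road=True, crashed=False),
    Step(speed=26.0, lateral_offset=0.0, on_road=True, crashed=False),
]
metrics = evaluate_episode(episode)
```

In this toy episode the third step drifts outside the assumed 4 m lane, so on-lane accuracy drops below 1, and the uneven speed changes give a nonzero jerk value. Averaging such per-episode summaries over many evaluation runs would be one way to compare a DQN agent and a TRPO agent on equal terms.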

