Deep reinforcement learning (DRL) is currently the most popular AI-based approach to autonomous vehicle control. An agent trained in simulation can interact with the real environment with human-level performance. Despite very good results on selected metrics, this approach has two significant drawbacks: high computational requirements and low explainability. As a result, a DRL-based agent cannot be used in some control tasks, especially when safety is the key issue. We therefore propose Tangled Program Graphs (TPGs) as an alternative to deep reinforcement learning in control-related tasks. In this approach, input signals are processed by simple programs that are combined in a graph structure. Consequently, TPGs are less computationally demanding, and their actions can be explained based on the graph structure. In this paper, we present our study of TPGs in this role. In particular, we consider the problem of navigating an unmanned aerial vehicle (UAV) through an unknown environment based solely on its on-board LiDAR sensor. The results of our work show promising prospects for the use of TPGs in control-related tasks.