This work presents a modular and parallelizable multi-agent deep reinforcement learning framework for instilling cooperative as well as competitive behaviors in autonomous vehicles. We introduce the AutoDRIVE Ecosystem as an enabler for developing physically accurate and graphically realistic digital twins of Nigel and F1TENTH, two scaled autonomous vehicle platforms with unique qualities and capabilities, and leverage this ecosystem to train and deploy multi-agent reinforcement learning policies. We first investigate an intersection traversal problem using a set of cooperative vehicles (Nigel) that share limited state information with each other, in both single-agent and multi-agent learning settings, using a common policy approach. We then investigate an adversarial head-to-head autonomous racing problem using a different set of vehicles (F1TENTH) in a multi-agent learning setting using an individual policy approach. In both sets of experiments, a decentralized learning architecture was adopted, which allowed robust training and testing of the approaches in stochastic environments, since the agents were mutually independent and exhibited asynchronous motion behavior. The problems were made more challenging by providing the agents with sparse observation spaces and requiring them to sample control commands that implicitly satisfied the imposed kinodynamic and safety constraints. The experimental results for both problem statements are reported in terms of quantitative metrics and qualitative remarks for the training as well as deployment phases.
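The distinction between the common policy approach (cooperative Nigel vehicles) and the individual policy approach (competing F1TENTH vehicles) under decentralized execution can be illustrated with a minimal sketch. This is not the authors' implementation: `make_policy`, `assign_policies`, `step_all`, the agent names, and the two-element `[throttle, steering]` action are all hypothetical, and a toy linear map stands in for a trained policy network.

```python
import random
from typing import Callable, Dict, List, Sequence

Observation = Sequence[float]
Action = List[float]
Policy = Callable[[Observation], Action]


def make_policy(seed: int) -> Policy:
    """Return a toy stand-in for a trained policy network (hypothetical)."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1.0, 1.0) for _ in range(2)]

    def policy(obs: Observation) -> Action:
        # Map the agent's local observation to [throttle, steering],
        # clipped to [-1, 1] as a placeholder for actuator limits.
        total = sum(obs)
        return [max(-1.0, min(1.0, w * total)) for w in weights]

    return policy


def assign_policies(agent_ids: List[str], shared: bool) -> Dict[str, Policy]:
    """Map each agent to a policy: one common policy, or one per agent."""
    if shared:
        common = make_policy(seed=0)
        return {agent_id: common for agent_id in agent_ids}
    return {agent_id: make_policy(seed=i) for i, agent_id in enumerate(agent_ids)}


def step_all(
    policies: Dict[str, Policy], observations: Dict[str, Observation]
) -> Dict[str, Action]:
    """Decentralized execution: each agent acts on its own observation only."""
    return {agent_id: policies[agent_id](observations[agent_id]) for agent_id in policies}


if __name__ == "__main__":
    cooperative = assign_policies(["nigel_1", "nigel_2"], shared=True)
    competitive = assign_policies(["f1tenth_1", "f1tenth_2"], shared=False)
    print(step_all(cooperative, {"nigel_1": [0.1, 0.2], "nigel_2": [0.3, 0.4]}))
    print(step_all(competitive, {"f1tenth_1": [0.1, 0.2], "f1tenth_2": [0.3, 0.4]}))
```

In the shared case every agent holds a reference to the same policy object, so experience from any vehicle would update one set of parameters. In the individual case each agent owns separate parameters, which suits adversarial settings where the agents' objectives conflict.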