Autonomous vehicles trained through Multi-Agent Reinforcement Learning (MARL) have shown impressive results in many driving scenarios. However, the performance of these trained policies can degrade when they encounter diverse driving styles and personalities, particularly in highly interactive situations. This is because conventional MARL algorithms usually assume fully cooperative behavior among all agents and focus on maximizing team rewards during training. To address this issue, we introduce the Personality Modeling Network (PeMN), which includes a cooperation value function and personality parameters to model the varied interactions in highly interactive scenarios. PeMN also enables the training of background traffic flow with diverse behaviors, thereby improving the performance and generalization of the ego vehicle. Our extensive experiments, which incorporate different personality parameters in highly interactive driving scenarios, show that the personality parameters effectively model diverse driving styles and that policies trained with PeMN generalize better than those trained with traditional MARL methods.