Decision-making is critical to the navigation performance of autonomous vehicles (AVs), and improving it in complex environments has become a central topic in data-driven research. Despite notable advances, existing learning-based decision-making strategies for AVs still leave room for improvement, particularly in policy representation and safety assurance. In this study, we formulate AV decision-making as a Constrained Markov Decision Process (CMDP) and treat it as a sequence modeling problem. Building on the Generative Pre-trained Transformer (GPT), we introduce a novel decision-making model for AVs that incorporates entropy regularization to encourage exploration and improve safety performance. Experiments across a range of scenarios show that our approach outperforms several established baseline methods, particularly in safety and overall performance.