Autonomous Vehicle (AV) decision making in urban environments is inherently challenging due to the dynamic interactions with surrounding vehicles. For safe planning, an AV must understand the relative importance of the various spatiotemporal interactions in a scene. Contemporary works use large transformer architectures to encode interactions, mainly for trajectory prediction, which increases computational complexity. To address this issue without compromising spatiotemporal understanding or performance, we propose a simple Deep Attention Driven Reinforcement Learning (DAD-RL) framework, which dynamically weights the significance of surrounding vehicles and incorporates it into the ego vehicle's RL-driven decision-making process. We introduce an AV-centric spatiotemporal attention encoding (STAE) mechanism for learning the dynamic interactions with different surrounding vehicles. To understand map and route context, we employ a context encoder to extract features from context maps. The spatiotemporal representations, combined with the contextual encoding, provide a comprehensive state representation. The resulting model is trained using the Soft Actor-Critic (SAC) algorithm. We evaluate the proposed framework on the SMARTS urban benchmarking scenarios without traffic signals and show that DAD-RL outperforms recent state-of-the-art methods. Furthermore, an ablation study underscores the importance of the context encoder and the spatiotemporal attention encoder in achieving superior performance.
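To make the idea of AV-centric attention concrete, the following is a minimal sketch, not the architecture used in this work. It assumes each vehicle's history has already been summarized into a fixed-length feature vector, and it uses plain scaled dot-product attention with the ego vehicle as the only query. All names, dimensions, and the random weights (`ego_attention`, `feat_dim`, `context_feat`, and so on) are hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def ego_attention(ego_feat, neighbor_feats, w_q, w_k, w_v):
    """Scaled dot-product attention with the ego vehicle as the single query.

    ego_feat:       (feat_dim,)   assumed temporal summary of the ego history
    neighbor_feats: (n, feat_dim) assumed temporal summaries of n neighbors
    Returns the attended interaction vector and the per-neighbor weights.
    """
    q = ego_feat @ w_q                      # (d,)
    k = neighbor_feats @ w_k                # (n, d)
    v = neighbor_feats @ w_v                # (n, d)
    scores = k @ q / np.sqrt(q.shape[0])    # (n,) one score per neighbor
    weights = softmax(scores)               # significance of each neighbor
    return weights @ v, weights


# Hypothetical sizes: 4 surrounding vehicles, 6-D inputs, 8-D attention space.
rng = np.random.default_rng(0)
feat_dim, d, n = 6, 8, 4
w_q, w_k, w_v = (rng.normal(size=(feat_dim, d)) for _ in range(3))

ego_feat = rng.normal(size=feat_dim)
neighbor_feats = rng.normal(size=(n, feat_dim))
attended, weights = ego_attention(ego_feat, neighbor_feats, w_q, w_k, w_v)

# Stand-in for the output of a context encoder over map and route inputs.
context_feat = rng.normal(size=16)

# Combined state that an RL policy (for example, one trained with SAC) could consume.
state = np.concatenate([attended, context_feat])
```

Because only the ego vehicle issues a query, the cost grows linearly with the number of surrounding vehicles, in contrast to full self-attention over all agents, which grows quadratically.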