The problem with existing camera-based Deep Reinforcement Learning approaches is twofold: they rarely integrate high-level scene context into the feature representation, and they rely on rigid, fixed reward functions. To address these challenges, this paper proposes a novel pipeline that produces a neuro-symbolic feature representation encompassing semantic, spatial, and shape information, as well as spatially boosted features of dynamic entities in the scene, with an emphasis on safety-critical road users. It also proposes a Soft First-Order Logic (SFOL) reward function that balances human values via a symbolic reasoning module: semantic and spatial predicates are extracted from segmentation maps and applied to linguistic rules to obtain reward weights. Quantitative experiments in the CARLA simulation environment show that the proposed neuro-symbolic representation and SFOL reward function improve policy robustness and safety-related performance metrics compared to baseline representations and reward formulations across varying traffic densities and occlusion levels. The findings demonstrate that integrating holistic representations and soft reasoning into Reinforcement Learning can support more context-aware and value-aligned decision-making for autonomous driving.