Large language models (LLMs) offer promising reasoning capabilities, but their integration into safety-critical driving systems is hindered by limited reasoning diversity, high computational overhead, and static learning paradigms. To address these challenges, we propose LUNA-AD, a lightweight uncertainty-aware language model with lifelong learning for autonomous driving (AD). LUNA-AD features a tri-system architecture that reconciles complex multimodal behavioral reasoning, efficient deployment, and continual refinement. First, a multi-agent analytical system generates uncertainty-aware decision-making demonstrations through diverse hypothesis exploration. Second, a dual-head lightweight heuristic model is distilled to jointly infer decision distributions and textual explanations while remaining efficient to deploy. Third, a reflection-driven lifelong learning mechanism operates on the multimodal decision outputs and preserves strategic diversity, refining candidate decisions and their rationales via closed-loop feedback to improve driving robustness. Extensive experiments on nuPlan benchmarks show that LUNA-AD achieves state-of-the-art success rates in both non-reactive and reactive modes, with substantially lower inference latency than existing knowledge-driven AD frameworks.