Existing learning-based autonomous driving (AD) systems face challenges in comprehending high-level information, generalizing to rare events, and providing interpretability. To address these problems, this work employs Large Language Models (LLMs) as a decision-making component for complex AD scenarios that require human commonsense understanding. We devise cognitive pathways to enable comprehensive reasoning with LLMs, and develop algorithms for translating LLM decisions into actionable driving commands. With this approach, LLM decisions are integrated with low-level controllers through guided parameter matrix adaptation. Extensive experiments demonstrate that the proposed method not only consistently surpasses baseline approaches in single-vehicle tasks, but also handles complex driving behaviors, and even multi-vehicle coordination, thanks to the commonsense reasoning capabilities of LLMs. This paper presents an initial step toward leveraging LLMs as effective decision-makers for intricate AD scenarios in terms of safety, efficiency, generalizability, and interpretability. We hope it will inspire future research in this field. Project page: https://sites.google.com/view/llm-mpc