Autonomous driving technology, a catalyst for revolutionizing transportation and urban mobility, is trending from rule-based systems toward data-driven strategies. Traditional modular systems are constrained by errors that accumulate across cascaded modules and by inflexible preset rules. In contrast, end-to-end autonomous driving systems have the potential to avoid error accumulation because their training is fully data-driven, although their ``black box'' nature often makes them opaque, complicating the validation and traceability of decisions. Recently, large language models (LLMs) have demonstrated abilities including context understanding, logical reasoning, and answer generation. A natural idea is to use these abilities to empower autonomous driving. Combining LLMs with vision foundation models could open the door to open-world understanding, reasoning, and few-shot learning, which current autonomous driving systems lack. In this paper, we systematically review the line of research on \textit{Large Language Models for Autonomous Driving (LLM4AD)}. We assess the current state of the technology and outline the principal challenges and prospective directions for the field. For the convenience of researchers in academia and industry, we provide real-time updates on the latest advances in the field, together with relevant open-source resources, at https://github.com/Thinklab-SJTU/Awesome-LLM4AD.