Autonomous driving technology, a catalyst for revolutionizing transportation and urban mobility, is trending from rule-based systems toward data-driven strategies. Traditional modular systems are constrained by errors that accumulate across cascaded modules and by inflexible pre-set rules. In contrast, end-to-end autonomous driving systems can potentially avoid error accumulation because their training is fully data-driven, although their "black box" nature often makes them opaque, which complicates the validation and traceability of decisions. Recently, large language models (LLMs) have demonstrated abilities including context understanding, logical reasoning, and answer generation. A natural idea is to use these abilities to empower autonomous driving. Combining LLMs with vision foundation models could open the door to open-world understanding, reasoning, and few-shot learning, all of which current autonomous driving systems lack. In this paper, we systematically review the line of research on \textit{Large Language Models for Autonomous Driving (LLM4AD)}. We assess the current state of technological progress and outline the principal challenges and prospective directions for the field. For the convenience of researchers in academia and industry, we provide real-time updates on the latest advances in the field, along with relevant open-source resources, at https://github.com/Thinklab-SJTU/Awesome-LLM4AD.