Human-level driving is an ultimate goal of autonomous driving. Conventional approaches formulate autonomous driving as a perception-prediction-planning framework, yet these systems do not capitalize on the inherent reasoning ability and experiential knowledge of humans. In this paper, we propose a fundamental paradigm shift from current pipelines, exploiting Large Language Models (LLMs) as a cognitive agent to integrate human-like intelligence into autonomous driving systems. Our approach, termed Agent-Driver, transforms the traditional autonomous driving pipeline by introducing a versatile tool library accessible via function calls, a cognitive memory of common sense and experiential knowledge for decision-making, and a reasoning engine capable of chain-of-thought reasoning, task planning, motion planning, and self-reflection. Powered by LLMs, Agent-Driver is endowed with intuitive common sense and robust reasoning capabilities, enabling a more nuanced, human-like approach to autonomous driving. We evaluate our approach on the large-scale nuScenes benchmark, and extensive experiments show that Agent-Driver outperforms state-of-the-art driving methods by a large margin. Our approach also demonstrates better interpretability and few-shot learning ability than these methods. Project page: \href{https://github.com/USC-GVL/Agent-Driver/blob/main/index.html}{here}.
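To make the three components named above more concrete, the following is a minimal, illustrative sketch of how a tool library reached through function calls, a cognitive memory, and a reasoning step might fit together. It is not the Agent-Driver implementation: every class, function, field name, and threshold here (`ToolLibrary`, `CognitiveMemory`, `plan`, `lead_distance_m`, the 10 m rule) is a hypothetical placeholder, and the LLM reasoning engine is replaced by a fixed rule so that the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ToolLibrary:
    """Hypothetical registry of perception/prediction tools, invoked by name."""
    tools: Dict[str, Callable[[dict], str]] = field(default_factory=dict)

    def register(self, name: str, fn: Callable[[dict], str]) -> None:
        self.tools[name] = fn

    def call(self, name: str, scene: dict) -> str:
        # A "function call": look the tool up by name and run it on the scene.
        return self.tools[name](scene)


@dataclass
class CognitiveMemory:
    """Hypothetical memory: common-sense rules plus past driving experiences."""
    rules: List[str]
    experiences: List[dict] = field(default_factory=list)

    def retrieve(self, scene: dict) -> Optional[dict]:
        # Toy retrieval: the past experience whose ego speed is closest.
        if not self.experiences:
            return None
        return min(
            self.experiences,
            key=lambda e: abs(e["ego_speed"] - scene["ego_speed"]),
        )


def plan(scene: dict, tools: ToolLibrary, memory: CognitiveMemory) -> dict:
    """Stand-in for the reasoning engine; a fixed rule replaces the LLM call."""
    # 1) Gather observations by calling every registered tool.
    observations = [tools.call(name, scene) for name in sorted(tools.tools)]
    # 2) Retrieve the most similar past experience.
    past = memory.retrieve(scene)
    # 3) Decide: an assumed safety rule first, then fall back to experience.
    if scene["lead_distance_m"] < 10.0:
        action = "brake"
    elif past is not None:
        action = past["action"]
    else:
        action = "keep_speed"
    return {"observations": observations, "retrieved": past, "action": action}


tools = ToolLibrary()
tools.register("detect_lead", lambda s: f"lead vehicle at {s['lead_distance_m']} m")
tools.register("ego_state", lambda s: f"ego speed {s['ego_speed']} m/s")

memory = CognitiveMemory(
    rules=["keep a safe following distance"],
    experiences=[
        {"ego_speed": 4.0, "action": "keep_speed"},
        {"ego_speed": 12.0, "action": "slow_down"},
    ],
)

result = plan({"ego_speed": 5.0, "lead_distance_m": 8.0}, tools, memory)
print(result["action"])
```

In a system of the kind the abstract describes, step 3 would presumably be a prompt to an LLM containing the tool outputs, the retrieved memory, and the common-sense rules; the fixed rule here only marks where that call would go.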