We present LaMPilot, a novel planning framework for autonomous driving that recasts the task as code generation over established behavioral primitives. The approach targets a problem that existing frameworks have typically handled poorly: interpreting and executing spontaneous user instructions such as "overtake the car ahead." We introduce the LaMPilot Benchmark, designed to quantitatively evaluate how well Large Language Models (LLMs) translate human directives into actionable driving policies, and we evaluate a wide range of state-of-the-art code-generation language models on its tasks. In our experiments, GPT-4 with human feedback achieved a task completion rate of 92.7% and a collision rate of 0.9%. To encourage further investigation in this area, our code and dataset will be made available.
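As a rough illustration of what "planning as code generation over behavioral primitives" could look like, the sketch below runs a short policy program against a toy two-lane road and then computes completion and collision rates over a set of runs. Everything in it is an assumption made for illustration: `ToyWorld`, the primitives `change_lane`, `keep_lane`, and `lead_offset`, the 5 m collision threshold, and the hand-written `GENERATED_POLICY` string (standing in for LLM output) are hypothetical and are not the LaMPilot API or benchmark.

```python
from dataclasses import dataclass, field


@dataclass
class Car:
    lane: int
    pos: float    # longitudinal position in metres
    speed: float  # metres per second


@dataclass
class ToyWorld:
    """Toy two-lane road: an ego car behind a slower lead car (illustrative only)."""
    ego: Car = field(default_factory=lambda: Car(lane=0, pos=0.0, speed=10.0))
    lead: Car = field(default_factory=lambda: Car(lane=0, pos=30.0, speed=5.0))
    collided: bool = False

    def step(self, dt: float = 1.0) -> None:
        # Advance both cars, then flag a collision if they share a lane
        # and are closer than an assumed 5 m threshold.
        self.ego.pos += self.ego.speed * dt
        self.lead.pos += self.lead.speed * dt
        if self.ego.lane == self.lead.lane and abs(self.ego.pos - self.lead.pos) < 5.0:
            self.collided = True

    # Hypothetical behavioral primitives exposed to the generated program.
    def change_lane(self, target: int) -> None:
        self.ego.lane = target
        self.step()

    def keep_lane(self, steps: int = 1) -> None:
        for _ in range(steps):
            self.step()

    def lead_offset(self) -> float:
        # Positive while the lead car is still ahead of the ego car.
        return self.lead.pos - self.ego.pos


# Hand-written stand-in for a program an LLM might emit for
# the instruction "overtake the car ahead".
GENERATED_POLICY = """
change_lane(1)
while lead_offset() > -10.0:
    keep_lane()
change_lane(0)
"""


def run_policy(source: str) -> dict:
    """Execute a policy program that may call only the primitives above."""
    world = ToyWorld()
    api = {
        "change_lane": world.change_lane,
        "keep_lane": world.keep_lane,
        "lead_offset": world.lead_offset,
    }
    # Emptying builtins narrows the namespace but is NOT a security sandbox.
    exec(source, {"__builtins__": {}}, api)
    # Assumed success criterion: no collision, back in the lead car's lane, and ahead of it.
    completed = (
        not world.collided
        and world.ego.lane == world.lead.lane
        and world.lead_offset() < 0
    )
    return {"completed": completed, "collided": world.collided}


def rates(results: list) -> tuple:
    """Return (task completion rate, collision rate) over a list of run results."""
    n = len(results)
    return (
        sum(r["completed"] for r in results) / n,
        sum(r["collided"] for r in results) / n,
    )
```

Calling `run_policy(GENERATED_POLICY)` executes the overtaking program, while a program such as `"keep_lane(10)"` stays behind the lead car until the gap closes and the run is flagged as a collision. Passing both results to `rates` gives the two kinds of aggregate figures the abstract reports, here computed on the toy world only.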