Autonomous vehicles (AVs) require adaptive behavior planners to navigate unpredictable, real-world environments safely. Traditional behavior trees (BTs) offer structured decision logic but are inherently static and demand labor-intensive manual tuning, limiting their applicability at SAE Level 5 autonomy. This paper presents an agentic framework that leverages large language models (LLMs) and large vision models (LVMs) to generate and adapt BTs on the fly. A specialized Descriptor agent applies chain-of-symbols prompting to assess scene criticality, a Planner agent constructs high-level sub-goals via in-context learning, and a Generator agent synthesizes executable BT sub-trees in XML format. Integrated into a CARLA+Nav2 simulation, the system is invoked only when the baseline BT fails, and it successfully navigates around unexpected obstacles (e.g., a street blockage) with no human intervention. Evaluated against a static BT baseline, the approach is a proof of concept that extends to diverse driving scenarios.
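As an illustration of the Generator agent's output format, a recovery sub-tree in the BehaviorTree.CPP-style XML consumed by Nav2 might look like the following sketch. The tree ID and node names (`PathIsBlocked`, `ComputeDetourPath`, `FollowPath`, `ReportBlockage`) are hypothetical placeholders for this example, not identifiers from the paper:

```xml
<!-- Hypothetical sketch of a generated recovery sub-tree in
     BehaviorTree.CPP-style XML; all IDs are illustrative only. -->
<BehaviorTree ID="DetourAroundBlockage">
  <Fallback name="handle_blockage">
    <Sequence name="detour">
      <Condition ID="PathIsBlocked"/>
      <Action ID="ComputeDetourPath"/>
      <Action ID="FollowPath"/>
    </Sequence>
    <!-- Fallback branch if the detour sequence fails -->
    <Action ID="ReportBlockage"/>
  </Fallback>
</BehaviorTree>
```

In this sketch, such a sub-tree would be grafted onto the baseline BT at the failed node and executed by the existing BT engine, so the generated XML must validate against the node types already registered in the system.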