In recent years, lightweight large language models (LLMs) have garnered significant attention in robotics owing to their low computational resource requirements and suitability for edge deployment. However, in task planning -- particularly for complex tasks that involve dynamic semantic and logical reasoning -- lightweight LLMs have underperformed. To address this limitation, we propose LightPlanner, a novel task planner that enhances the performance of lightweight LLMs in complex task planning by fully leveraging their reasoning capabilities. Unlike conventional planners that rely on fixed skill templates, LightPlanner controls robot actions through parameterized function calls, generating parameter values dynamically. This enables fine-grained skill control and improves task planning success rates in complex scenarios. Furthermore, we introduce hierarchical deep reasoning: before generating each action decision step, LightPlanner deliberates at three levels -- action execution (feedback verification), semantic parsing (goal-consistency verification), and parameter generation (parameter-validity verification) -- ensuring the correctness of subsequent action controls. Additionally, we incorporate a memory module that stores historical actions, reducing context length and improving planning efficiency on long-horizon tasks. We train the LightPlanner-1.5B model on our LightPlan-40k dataset, which comprises 40,000 action controls spanning tasks with 2 to 13 action steps. Experiments demonstrate that our model achieves the highest task success rate despite having the fewest parameters; on tasks involving spatial semantic reasoning, its success rate exceeds that of ReAct by 14.9 percent. Finally, we demonstrate LightPlanner's potential to run on edge devices.
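The parameterized-function-call control and the action-history memory described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the skill names (`move_to`, `grasp`), the registry, and the `Planner` class are hypothetical stand-ins for how a planner might invoke skills with dynamically generated parameter values and store compact action records instead of full context.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical skill registry: each skill is a parameterized function rather
# than a fixed template, so parameter values can be generated at decision time.
SKILLS: dict[str, Callable[..., str]] = {}

def skill(fn: Callable[..., str]) -> Callable[..., str]:
    """Register a function as a callable robot skill."""
    SKILLS[fn.__name__] = fn
    return fn

@skill
def move_to(x: float, y: float) -> str:
    return f"moved to ({x}, {y})"

@skill
def grasp(obj: str) -> str:
    return f"grasped {obj}"

@dataclass
class Planner:
    # Memory module: stores concise records of past actions so the full
    # interaction context need not be replayed on every step.
    history: list[str] = field(default_factory=list)

    def execute(self, name: str, **params) -> str:
        # Parameter-validity check: reject unknown skills before acting.
        if name not in SKILLS:
            raise ValueError(f"unknown skill: {name}")
        result = SKILLS[name](**params)
        # Feedback verification: record the outcome of each action.
        self.history.append(f"{name}({params}) -> {result}")
        return result

planner = Planner()
planner.execute("move_to", x=1.0, y=2.0)
planner.execute("grasp", obj="cup")
```

In this sketch the model's job reduces to emitting a skill name and parameter values per step; the actual system adds the goal-consistency and feedback-verification reasoning between steps.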