Performing complex manipulation tasks in dynamic environments requires efficient Task and Motion Planning (TAMP) approaches, which combine high-level symbolic planning with low-level motion planning. Advances in Large Language Models (LLMs), such as GPT-4, are transforming task planning by offering natural language as an intuitive and flexible way to describe tasks, generate symbolic plans, and reason about them. However, the effectiveness of LLM-based TAMP approaches is limited by static, template-based prompting, which struggles to adapt to dynamic environments and complex task contexts. To address these limitations, this work proposes a novel ontology-driven prompt-tuning framework that employs knowledge-based reasoning to refine and expand user prompts with task-context reasoning and knowledge-based descriptions of the environment state. Integrating domain-specific knowledge into the prompt ensures semantically accurate and context-aware task plans. The framework demonstrates its effectiveness by resolving semantic errors in symbolic plan generation, such as failures to maintain a logical temporal ordering of goals in scenarios involving hierarchical object placement. It is validated in both simulation and real-world scenarios, where it shows significant improvements over the baseline approach in adaptability to dynamic environments and in the generation of semantically correct task plans.
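To make the idea of knowledge-based prompt expansion concrete, the following is a minimal sketch, not the framework's actual implementation. It assumes a toy knowledge base given as a plain dictionary of "rests on" relations, which stands in for a real ontology and reasoner. The names `SUPPORTED_BY`, `placement_order`, and `augment_prompt` are hypothetical and chosen only for illustration. The sketch derives a temporal goal ordering for a hierarchical placement task (supports are placed before the objects that rest on them) and appends it, together with environment-state facts, to the user prompt.

```python
from graphlib import TopologicalSorter

# Hypothetical toy knowledge base (an assumption, not from the paper):
# each object maps to the object it must rest on.
SUPPORTED_BY = {"cup": "plate", "plate": "tray", "tray": "table"}


def placement_order(objects):
    """Order objects so that every support is placed before what rests on it."""
    graph = {}
    for obj in objects:
        support = SUPPORTED_BY.get(obj)
        # Only supports that are themselves part of the task create a dependency;
        # fixed scene elements such as the table are ignored.
        graph[obj] = {support} if support in objects else set()
    return list(TopologicalSorter(graph).static_order())


def augment_prompt(user_prompt, objects, scene_facts):
    """Expand a user prompt with scene facts and an inferred goal ordering."""
    order = placement_order(objects)
    lines = [user_prompt, "Environment state:"]
    lines += [f"- {fact}" for fact in scene_facts]
    lines.append("Goal ordering constraint (inferred from the knowledge base):")
    lines.append(" -> ".join(f"place {obj}" for obj in order))
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = augment_prompt(
        "Set the table with the cup, plate, and tray.",
        ["cup", "plate", "tray"],
        ["tray is on the shelf", "cup is on the counter"],
    )
    print(prompt)
```

In a real system the dictionary lookup would presumably be replaced by queries to an ontology reasoner, and the expanded prompt would be passed to the LLM that generates the symbolic plan. The topological sort here is only one simple way to obtain an ordering consistent with the support relations.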