Large language models (LLMs) exhibit strong automatic reasoning and planning capabilities, backed by a wealth of semantic knowledge about the human world. However, the grounding problem still hinders the application of LLMs in real-world environments. Existing studies fine-tune the LLM or rely on pre-defined behavior APIs to bridge the LLM and the environment, which not only requires substantial human effort to customize for every single task but also weakens the generality of LLMs. To ground the LLM in the environment autonomously, we propose the Self-Driven Grounding (SDG) framework, which automatically and progressively grounds the LLM through self-driven skill learning. SDG first employs the LLM to propose hypotheses of sub-goals for achieving tasks and then verifies the feasibility of these hypotheses by interacting with the underlying environment. Once the hypotheses are verified, SDG learns generalized skills under the guidance of the successfully grounded sub-goals. These skills can then be used to accomplish more complex tasks that fail to pass the verification phase. Evaluated on BabyAI, a well-known instruction-following task set, SDG achieves performance on the most challenging tasks comparable to imitation learning methods that require millions of demonstrations, demonstrating the effectiveness of the learned skills and the feasibility and efficiency of our framework.
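The propose, verify, learn, and reuse loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every name below (`SkillLibrary`, `ground`, `propose`, `verify`) is hypothetical, the LLM is replaced by a lookup table, environment interaction is replaced by a stub that succeeds only on simple tasks, and "skill learning" is reduced to recording verified sub-goals. It only shows how skills grounded on simple tasks might later cover a complex task whose own verification failed.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical type aliases; none of these names come from the paper.
Proposer = Callable[[str], List[str]]   # task -> hypothesised sub-goals (the LLM's role)
Verifier = Callable[[str, str], bool]   # (task, sub-goal) -> feasible in the environment?


@dataclass
class SkillLibrary:
    """Stores one placeholder skill per successfully grounded sub-goal."""
    skills: Dict[str, str] = field(default_factory=dict)

    def learn(self, subgoal: str) -> None:
        # Assumption: a real system would train a policy here; we only record a label.
        self.skills[subgoal] = f"skill_for::{subgoal}"

    def covers(self, subgoal: str) -> bool:
        return subgoal in self.skills


def ground(tasks: List[str], propose: Proposer, verify: Verifier) -> Tuple[SkillLibrary, List[str]]:
    library = SkillLibrary()
    failed: List[str] = []
    # Phase 1: hypothesise sub-goals, verify them, and learn skills from verified ones.
    for task in tasks:
        subgoals = propose(task)
        verified = [g for g in subgoals if verify(task, g)]
        for g in verified:
            library.learn(g)
        if len(verified) < len(subgoals):
            failed.append(task)
    # Phase 2: revisit tasks that failed verification, now using the learned skills.
    recovered = [t for t in failed if all(library.covers(g) for g in propose(t))]
    return library, recovered


# Toy stand-in for the LLM: a fixed table of sub-goal hypotheses.
HYPOTHESES = {
    "go to the red ball": ["go to red ball"],
    "pick up the key": ["go to key", "pick up key"],
    "pick up the key then go to the red ball": ["go to key", "pick up key", "go to red ball"],
}
EASY_TASKS = {"go to the red ball", "pick up the key"}

library, recovered = ground(
    tasks=list(HYPOTHESES),
    propose=lambda task: HYPOTHESES[task],
    # Toy stand-in for environment interaction: only simple tasks verify directly.
    verify=lambda task, subgoal: task in EASY_TASKS,
)
print(sorted(library.skills))
print(recovered)
```

In this toy run the composite task fails direct verification, but all of its sub-goals are covered by skills learned from the two simple tasks, so it appears in `recovered`.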