An agent facing a planning problem can use answers to how-to questions to reduce uncertainty and fill knowledge gaps, helping it solve both current and future tasks. However, such questions are open-ended: valid answers to "How do I X?" range from executable actions to high-level descriptions of X's subgoals. This makes them challenging for AI agents to ask, and for AI experts to answer, in ways that support efficient planning. We introduce $How^{2}$, a memory agent framework that enables agents to ask how-to questions, store the answers, and reuse them for lifelong learning in interactive environments. We evaluate our approach in Plancraft, a Minecraft crafting environment, where agents must complete an assembly task by manipulating inventory items. Using teacher models that answer at varying levels of abstraction, from executable action sequences to high-level subgoal descriptions, we show that lifelong learning agents benefit most from answers that are abstracted and decoupled from the current state. $How^{2}$ offers a way for LLM-based agents to improve their planning capabilities over time by asking questions in interactive environments.
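
The ask, store, and reuse loop described above can be illustrated with a minimal sketch. It is not the $How^{2}$ implementation: `HowToMemory`, `get_guidance`, and `stub_teacher` are hypothetical names introduced here, the teacher is a stand-in function rather than an LLM, and keying the memory on the goal alone is an assumption meant to mirror answers that are decoupled from the current state.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class HowToMemory:
    """Hypothetical store mapping a goal to a previously received answer."""
    answers: Dict[str, str] = field(default_factory=dict)

    def lookup(self, goal: str) -> Optional[str]:
        return self.answers.get(goal)

    def store(self, goal: str, answer: str) -> None:
        self.answers[goal] = answer


def get_guidance(goal: str, memory: HowToMemory,
                 teacher: Callable[[str], str]) -> str:
    """Reuse a stored answer if one exists; otherwise ask and remember it."""
    cached = memory.lookup(goal)
    if cached is not None:
        return cached
    answer = teacher(f"How do I {goal}?")
    memory.store(goal, answer)
    return answer


# Stand-in teacher (an assumption): records each question it receives and
# returns a fixed, state-independent subgoal description.
calls: List[str] = []


def stub_teacher(question: str) -> str:
    calls.append(question)
    return "Craft planks from logs, then craft sticks from planks."


memory = HowToMemory()
first = get_guidance("craft a stick", memory, stub_teacher)
second = get_guidance("craft a stick", memory, stub_teacher)  # served from memory
```

In this sketch the second request never reaches the teacher, which is the sense in which a stored answer can help with future tasks as well as the current one.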