Recent research on instructable agents has used memory-augmented Large Language Models (LLMs) as task planners: the technique retrieves language-program examples relevant to the input instruction and uses them as in-context examples in the LLM prompt, improving the LLM's ability to infer correct action and task plans. In this technical report, we extend the capabilities of HELPER by expanding its memory with a wider array of examples and prompts and by integrating additional APIs for asking questions. This simple expansion of HELPER into a shared memory, yielding HELPER-X, enables the agent to work across the domains of executing plans from dialogue, natural language instruction following, active question asking, and commonsense room reorganization. We evaluate the agent on four diverse interactive visual-language embodied agent benchmarks: ALFRED, TEACh, DialFRED, and the Tidy Task. HELPER-X achieves few-shot, state-of-the-art performance across these benchmarks using a single agent, without requiring in-domain training, and remains competitive with agents that have undergone in-domain training.
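The retrieval-augmented prompting described above can be illustrated with a minimal sketch: the input instruction is embedded, the most similar language-program pairs are retrieved from memory, and they are prepended to the prompt as in-context examples. The memory entries, the toy bag-of-words embedding, and the prompt format below are illustrative assumptions, not HELPER's actual implementation.

```python
# Minimal sketch of memory-augmented LLM task planning. All memory entries,
# the embed() function, and the prompt format are hypothetical, chosen only
# to illustrate the retrieve-then-prompt pattern described in the abstract.
from collections import Counter
import math

# Hypothetical memory of (instruction, program) examples.
MEMORY = [
    ("put the mug in the sink", "goto(mug); pickup(mug); goto(sink); put(mug, sink)"),
    ("turn on the lamp", "goto(lamp); toggle_on(lamp)"),
    ("slice the bread", "goto(knife); pickup(knife); goto(bread); slice(bread)"),
]

def embed(text):
    """Toy bag-of-words embedding; a real system would use a learned encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(instruction, k=2):
    """Return the k memory examples most similar to the input instruction."""
    q = embed(instruction)
    ranked = sorted(MEMORY, key=lambda ex: cosine(q, embed(ex[0])), reverse=True)
    return ranked[:k]

def build_prompt(instruction):
    """Prepend retrieved language-program pairs as in-context examples."""
    shots = "\n".join(f"Instruction: {i}\nProgram: {p}" for i, p in retrieve(instruction))
    return f"{shots}\nInstruction: {instruction}\nProgram:"

# The resulting prompt would be sent to the LLM, which completes the program.
print(build_prompt("place the cup in the sink"))
```

Expanding such a memory with examples from multiple domains, as HELPER-X does, leaves this retrieval loop unchanged: only the pool of candidate examples grows.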