Large language models (LLMs) offer significant promise as a knowledge source for task learning. Prompt engineering has been shown to be effective for eliciting knowledge from an LLM, but alone it is insufficient for acquiring relevant, situationally grounded knowledge for an embodied agent learning novel tasks. We describe a cognitive-agent approach, STARS, that extends and complements prompt engineering, mitigating its limitations and thus enabling an agent to acquire new task knowledge matched to its native language capabilities, embodiment, environment, and user preferences. STARS increases the response space of the LLM and deploys general strategies, embedded within the autonomous agent, to evaluate, repair, and select among the candidate responses the LLM produces. We describe the approach and present experiments showing that an agent, by retrieving and evaluating a breadth of responses from the LLM, can achieve 77-94% task completion in one-shot learning without user oversight. The approach achieves 100% task completion when human oversight (such as an indication of preference) is provided. Further, the type of oversight largely shifts from explicit, natural language instruction to simple confirmation/disconfirmation of high-quality responses that the agent has vetted before presenting them to a user.
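The evaluate, repair, and select strategy described in the abstract can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in and not the STARS implementation: `KNOWN_WORDS` approximates the agent's language capabilities and environment, `REPAIRS` is a toy substitution table standing in for whatever repair mechanism the agent actually uses, and `Candidate.score` stands in for an LLM-derived likelihood.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical agent knowledge: words the agent can parse and ground in its environment.
KNOWN_WORDS = {"put", "open", "close", "the", "in", "mug", "sink", "dishwasher"}
# Hypothetical repair table mapping unknown words to known equivalents.
REPAIRS = {"place": "put", "cup": "mug"}


@dataclass
class Candidate:
    text: str
    score: float  # assumed stand-in for an LLM-derived likelihood


def unknown_words(text: str) -> List[str]:
    """Evaluate: return the words the agent cannot interpret."""
    return [w for w in text.lower().split() if w not in KNOWN_WORDS]


def repair(text: str) -> str:
    """Repair: substitute known equivalents for unknown words where possible."""
    return " ".join(REPAIRS.get(w, w) for w in text.lower().split())


def select_response(candidates: List[Candidate]) -> Optional[Candidate]:
    """Select: keep candidates that are viable (possibly after repair), return the best."""
    viable = []
    for cand in candidates:
        text = cand.text
        if unknown_words(text):
            text = repair(text)
        if not unknown_words(text):
            viable.append(Candidate(text, cand.score))
    return max(viable, key=lambda c: c.score) if viable else None


# A breadth of responses, as if sampled from an LLM for "where does the mug go?"
candidates = [
    Candidate("place the cup in the sink", 0.5),      # repairable
    Candidate("put the mug in the dishwasher", 0.3),  # viable as-is
    Candidate("throw the mug in the trash", 0.2),     # not repairable here
]
best = select_response(candidates)
print(best.text if best else "ask the user")  # prints "put the mug in the sink"
```

When no candidate survives evaluation, the sketch returns `None`, which corresponds to the case where the agent would fall back on the user; with oversight, the selected response could instead be shown to the user for confirmation or disconfirmation.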