CLIN: A Continually Learning Language Agent for Rapid Task Adaptation and Generalization

Language agents have shown some ability to interact with an external environment, e.g., a virtual world such as ScienceWorld, to perform complex tasks, e.g., growing a plant, without the startup costs of reinforcement learning. However, despite their zero-shot capabilities, these agents to date do not continually improve over time beyond performance refinement on a specific task. Here we present CLIN, the first language-based agent to achieve this, so that it continually improves over multiple trials, including when both the environment and task are varied, and without requiring parameter updates. Our approach is to use a persistent, dynamic, textual memory centered on causal abstractions (rather than general "helpful hints") that is regularly updated after each trial so that the agent gradually learns useful knowledge for new trials. In the ScienceWorld benchmark, CLIN is able to continually improve on repeated trials on the same task and environment, outperforming state-of-the-art reflective language agents like Reflexion by 23 absolute points. CLIN can also transfer its learning to new environments (or new tasks), improving its zero-shot performance by 4 points (13 for new tasks) and can further improve performance there through continual memory updates, enhancing performance by an additional 17 points (7 for new tasks). This suggests a new architecture for agents built on frozen models that can still continually and rapidly improve over time.
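The memory loop described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the names `CausalInsight`, `Memory`, and `run_trials` are hypothetical, the "may be necessary to" / "does not contribute to" phrasing is one possible way to express a causal abstraction as text, and `run_trial` and `reflect` stand in for calls to a frozen language model that the abstract does not specify.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CausalInsight:
    """One memory entry: an action and the effect it was observed to have."""
    action: str
    effect: str
    helps: bool  # True: the action seemed to help reach the effect


@dataclass
class Memory:
    """Persistent textual memory that survives across trials."""
    insights: list = field(default_factory=list)

    def update(self, new_insights):
        """Merge insights from the latest trial; a newer insight replaces an
        older one about the same (action, effect) pair."""
        merged = {(i.action, i.effect): i for i in self.insights}
        for insight in new_insights:
            merged[(insight.action, insight.effect)] = insight
        self.insights = list(merged.values())

    def as_prompt(self):
        """Render the memory as plain text to prepend to the agent's prompt."""
        lines = []
        for i in self.insights:
            relation = "may be necessary to" if i.helps else "does not contribute to"
            lines.append(f"{i.action} {relation} {i.effect}")
        return "\n".join(lines)


def run_trials(run_trial, reflect, num_trials, memory=None):
    """Run several trials, updating the memory after each one.

    run_trial(memory_text) -> (trace, score)   # hypothetical agent rollout
    reflect(trace, score)  -> list[CausalInsight]  # hypothetical reflection step
    No model parameters are touched; only the textual memory changes.
    """
    memory = memory if memory is not None else Memory()
    scores = []
    for _ in range(num_trials):
        trace, score = run_trial(memory.as_prompt())
        scores.append(score)
        memory.update(reflect(trace, score))
    return memory, scores
```

Passing an existing `Memory` object into `run_trials` for a different task or environment corresponds to the transfer setting in the abstract: the agent starts the new trials with what it learned earlier and keeps updating from there.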

