Limitations of Agents Simulated by Predictive Models

There is increasing focus on adapting predictive models into agent-like systems, most notably AI assistants based on language models. We outline two structural reasons for why these models can fail when turned into agents. First, we discuss auto-suggestive delusions. Prior work has shown theoretically that models fail to imitate agents that generated the training data if the agents relied on hidden observations: the hidden observations act as confounding variables, and the models treat actions they generate as evidence for nonexistent observations. Second, we introduce and formally study a related, novel limitation: predictor-policy incoherence. When a model generates a sequence of actions, the model's implicit prediction of the policy that generated those actions can serve as a confounding variable. The result is that models choose actions as if they expect future actions to be suboptimal, causing them to be overly conservative. We show that both of those failures are fixed by including a feedback loop from the environment, that is, re-training the models on their own actions. We give simple demonstrations of both limitations using Decision Transformers and confirm that empirical results agree with our conceptual and formal analysis. Our treatment provides a unifying view of those failure modes, and informs the question of why fine-tuning offline learned policies with online learning makes them more effective.

翻译：越来越多的研究关注将预测模型改造为类智能体系统，其中最突出的是基于语言模型的AI助手。我们概述了此类模型转化为智能体时可能失效的两个结构性原因。首先，我们讨论了"自我暗示妄想"问题。先前理论研究表明，若训练数据生成过程中智能体依赖隐藏观测变量，模型将无法模仿该智能体行为：隐藏观测变量作为混杂因子，导致模型将自身生成的动作视为非真实观测的证据。其次，我们引入并正式研究了相关的新型局限性——"预测者-策略不一致性"。当模型生成动作序列时，其对生成该动作序列的策略的隐含预测会作为混杂变量。这导致模型在选择动作时仿佛预期后续动作存在次优性，从而产生过度保守的行为。我们证明这两种失效可通过引入环境反馈回路（即基于模型自身动作进行重训练）加以修正。通过使用决策变压器进行的简单实验验证了这两个局限性，且实证结果与我们的概念及形式化分析一致。我们的研究为这些失效模式提供了统一视角，并解释了为何使用在线学习微调离线学习策略能提升其有效性。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

【CVPR 2022】一个完全无监督的框架，从噪声和部分测量中学习图像，Robust Equivariant Imaging: a fully unsupervised framework for learning to image

专知会员服务

25+阅读 · 2022年3月3日

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

35+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日