The model editing problem concerns how language models should learn new facts about the world over time. While empirical research on model editing has drawn widespread attention, the conceptual foundations of model editing remain shaky -- perhaps unsurprisingly, since model editing is essentially belief revision, a storied problem in philosophy that has eluded succinct solutions for decades. Model editing nonetheless demands a solution, since we need to be able to control the knowledge within language models. With this goal in mind, this paper critiques the standard formulation of the model editing problem and proposes a formal testbed for model editing research. We first describe 12 open problems with model editing, based on challenges with (1) defining the problem, (2) developing benchmarks, and (3) assuming LLMs have editable beliefs in the first place. Many of these challenges are extremely difficult to address, e.g., determining the far-reaching consequences of edits, labeling probabilistic entailments between facts, and updating the beliefs of agent simulators. Next, we introduce a semi-synthetic dataset for model editing based on Wikidata, where we can evaluate edits against labels given by an idealized Bayesian agent. This enables us to say exactly how belief revision in language models falls short of a desirable epistemic standard. We encourage further research in settings where model behavior can be compared against such a gold standard. Our code is publicly available at: https://github.com/peterbhase/LLM-belief-revision
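To make the evaluation idea concrete, here is a minimal sketch (not the paper's implementation; the toy facts, world prior, and function names are all hypothetical) of scoring an edit against an idealized Bayesian agent: the agent conditions its prior over possible worlds on the edited fact, and the edited model's probability for an entailed fact is compared to the resulting posterior.

```python
# Minimal sketch of comparing an edited model's belief against a Bayesian
# gold label. All data and names here are illustrative placeholders; in
# practice the prior would be derived from Wikidata-style facts.
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    subject: str
    relation: str
    obj: str


# Toy prior over possible worlds: each world is a set of facts with a
# probability mass.
worlds = {
    frozenset({Fact("Paris", "capital_of", "France"),
               Fact("France", "continent", "Europe")}): 0.7,
    frozenset({Fact("Paris", "capital_of", "France"),
               Fact("France", "continent", "Asia")}): 0.3,
}


def bayesian_posterior(query: Fact, evidence: Fact) -> float:
    """P(query | evidence) for an ideal agent that conditions on the edit."""
    consistent = {w: p for w, p in worlds.items() if evidence in w}
    z = sum(consistent.values())
    if z == 0:
        return 0.0  # the edit contradicts every world in the prior
    return sum(p for w, p in consistent.items() if query in w) / z


def edit_error(model_prob: float, query: Fact, evidence: Fact) -> float:
    """Gap between the edited model's probability and the Bayesian label."""
    return abs(model_prob - bayesian_posterior(query, evidence))


# Example: after editing in "Paris is the capital of France", how far is the
# model's belief in "France is in Europe" from the ideal posterior (0.7)?
evidence = Fact("Paris", "capital_of", "France")
query = Fact("France", "continent", "Europe")
print(edit_error(model_prob=0.55, query=query, evidence=evidence))  # 0.15
```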