Machine unlearning, the study of efficiently removing the impact of specific training instances on a model, has garnered increased attention in recent years due to regulatory guidelines such as the \emph{Right to be Forgotten}. Achieving exact unlearning typically requires fully retraining the model, which is computationally infeasible for very large models such as Large Language Models (LLMs). To this end, recent work has proposed several algorithms that approximate the removal of training data without retraining the model. These algorithms crucially rely on access to the model parameters in order to update them, an assumption that may not hold in practice due to computational constraints or to having only query access to the LLM. In this work, we propose a new class of unlearning methods for LLMs called ``In-Context Unlearning.'' This method unlearns instances from the model by simply providing specific kinds of inputs in context, without the need to update model parameters. To unlearn specific training instances, we present these instances to the LLM at inference time along with labels that differ from their ground truth. Our experimental results demonstrate that in-context unlearning performs on par with, and in some cases outperforms, other state-of-the-art methods that require access to model parameters, effectively removing the influence of specific instances on the model while preserving test accuracy.
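The core mechanism can be illustrated with a minimal sketch: a prompt is assembled in which each forget instance appears with its label flipped, followed by correctly labelled context examples and the query. The helper below is hypothetical (the paper's exact prompt template and label set may differ) and assumes a binary sentiment task for concreteness.

```python
def build_icul_prompt(forget_points, context_points, query,
                      labels=("negative", "positive")):
    """Assemble an in-context unlearning prompt (illustrative sketch).

    forget_points: (text, ground_truth_label) pairs whose influence
        should be removed; their labels are flipped in the prompt.
    context_points: (text, label) pairs shown with correct labels.
    query: the input whose label the LLM is asked to predict.
    """
    lines = []
    for text, label in forget_points:
        # Flip the label of each instance to be "unlearned".
        flipped = labels[1 - labels.index(label)]
        lines.append(f"Review: {text}\nSentiment: {flipped}")
    for text, label in context_points:
        # Context examples keep their ground-truth labels.
        lines.append(f"Review: {text}\nSentiment: {label}")
    # The query is appended with an empty label for the LLM to complete.
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)
```

At inference time, the resulting string would be sent to the (query-access-only) LLM in place of the standard few-shot prompt; no gradient step or parameter update is involved.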