Machine unlearning, the study of efficiently removing the impact of specific training instances on a model, has garnered increased attention in recent years due to regulatory guidelines such as the \emph{Right to be Forgotten}. Achieving exact unlearning typically requires fully retraining the model, which is computationally infeasible for very large models such as Large Language Models (LLMs). To this end, recent work has proposed several algorithms that approximate the removal of training data without retraining the model. These algorithms crucially rely on access to the model parameters in order to update them, an assumption that may not hold in practice due to computational constraints or because only query access to the LLM is available. In this work, we propose a new class of unlearning methods for LLMs called ``In-Context Unlearning.'' These methods unlearn instances from the model by simply providing specific kinds of inputs in context, without the need to update model parameters. To unlearn specific training instances, we present these instances to the LLM at inference time along with labels that differ from their ground truth. Our experimental results demonstrate that in-context unlearning performs on par with, and in some cases outperforms, state-of-the-art methods that require access to model parameters, effectively removing the influence of specific instances on the model while preserving test accuracy.
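To make the inference-time procedure concrete, the following is a minimal sketch of how an in-context unlearning prompt could be assembled: the instance to forget is shown with a flipped label, followed by correctly labeled context examples and the query. The function name, the sentiment-classification framing, and the prompt template are illustrative assumptions, not the paper's exact format.

```python
def build_icul_prompt(forget_point, context_examples, query_text,
                      label_set=("negative", "positive")):
    """Assemble an inference-time prompt that aims to 'unlearn' forget_point.

    Hypothetical template: the instance to forget appears first with a
    label different from its ground truth, followed by correctly labeled
    examples, then the query whose prediction we want from the LLM.
    """
    text, true_label = forget_point
    # Flip the label of the instance whose influence we want removed.
    flipped = label_set[1 - label_set.index(true_label)]
    lines = [f"Review: {text}\nSentiment: {flipped}"]
    # Append correctly labeled in-context examples.
    for ex_text, ex_label in context_examples:
        lines.append(f"Review: {ex_text}\nSentiment: {ex_label}")
    # End with the query; the model completes the final label slot.
    lines.append(f"Review: {query_text}\nSentiment:")
    return "\n\n".join(lines)

prompt = build_icul_prompt(
    ("The movie was dull.", "negative"),
    [("A wonderful film.", "positive"), ("Terrible pacing.", "negative")],
    "An instant classic.",
)
```

The resulting string would then be sent to the LLM via whatever query interface is available, requiring no access to model parameters.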