Large language model (LLM) unlearning has garnered increasing attention for its potential to address security and privacy concerns, spurring extensive research in the field. However, much of this research has concentrated on instance-level unlearning, i.e., removing predefined instances that contain sensitive content. This focus leaves a significant gap in the study of full entity-level unlearning, which is critical in real-world scenarios such as copyright protection. To this end, we propose the novel task of entity-level unlearning, which aims to completely erase entity-related knowledge from the target model. To investigate this task thoroughly, we systematically evaluate popular unlearning algorithms and find that current methods struggle to achieve effective entity-level unlearning. We then examine the factors that influence unlearning performance, identifying knowledge coverage and the size of the forget set as playing pivotal roles. Notably, our analysis also reveals that entities introduced through fine-tuning are more vulnerable to unlearning than those acquired during pre-training. These findings collectively offer valuable insights for advancing entity-level unlearning in LLMs.
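To make the task concrete, the following is a minimal sketch of one common family of unlearning baselines, gradient ascent on a forget set, applied to an entity. The model name, the contents of `forget_set`, and the hyperparameters here are illustrative assumptions for exposition, not the exact algorithms or setup evaluated in this work.

```python
# Minimal sketch: gradient-ascent unlearning over an entity's forget set.
# Assumptions (not from this paper): model choice, forget_set contents,
# and hyperparameters are placeholders for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # hypothetical stand-in for the target model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.train()

# Hypothetical forget set: text expressing the entity's knowledge.
forget_set = [
    "Q: Where was Harry Potter born? A: Godric's Hollow.",
    "Q: Who are Harry Potter's parents? A: James and Lily Potter.",
]

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

for text in forget_set:
    batch = tokenizer(text, return_tensors="pt")
    # Standard language-modeling loss on the forget sample...
    loss = model(**batch, labels=batch["input_ids"]).loss
    # ...negated, so the update *ascends* the loss and degrades the
    # model's ability to reproduce the entity-related knowledge.
    (-loss).backward()
    optimizer.step()
    optimizer.zero_grad()
```

Entity-level unlearning stresses such methods because the forget set must cover all of an entity's knowledge in the model, which connects directly to the knowledge-coverage and forget-set-size factors analyzed above.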