As Large Language Models (LLMs) become increasingly prevalent, their security vulnerabilities have drawn growing attention. Machine unlearning seeks to mitigate these risks by removing the influence of undesirable data. However, existing methods not only rely on a retained dataset to preserve model utility, but also suffer from cumulative catastrophic utility loss under continuous unlearning requests. To resolve this dilemma, we propose Rotation Control Unlearning (RCU), a novel method that leverages a rotational salience weight to quantify and control the degree of unlearning throughout the continuous unlearning process. A skew-symmetric loss is designed to construct a cognitive rotation space, in which changes of the rotational angle simulate the continuous unlearning process. Furthermore, we design an orthogonal rotation-axes regularization that enforces mutually perpendicular rotation directions for successive unlearning requests, effectively minimizing interference and addressing cumulative catastrophic utility loss. Experiments on multiple datasets confirm that our method achieves state-of-the-art (SOTA) performance without any retained dataset.
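The geometric intuition behind the abstract can be illustrated with a minimal sketch. This is not the paper's implementation; the generator matrices, the angle `theta`, and the penalty function are all illustrative assumptions. It shows only the two linear-algebra facts the method relies on: a skew-symmetric generator exponentiates to an orthogonal rotation (so scaling the angle controls "how far" representations are rotated without collapsing them), and a Frobenius inner-product penalty can push two rotation axes toward perpendicularity.

```python
import numpy as np
from scipy.linalg import expm

def skew(M):
    """Project a square matrix onto its skew-symmetric part (A = -A^T)."""
    return (M - M.T) / 2.0

rng = np.random.default_rng(0)
d = 4
A1 = skew(rng.standard_normal((d, d)))  # hypothetical axis, request 1
A2 = skew(rng.standard_normal((d, d)))  # hypothetical axis, request 2

theta = 0.3          # rotation angle, standing in for the unlearning degree
R = expm(theta * A1) # matrix exponential of a skew generator is a rotation

# Rotations are orthogonal, so norms are preserved: directions carrying
# model utility are rotated, not destroyed.
assert np.allclose(R @ R.T, np.eye(d), atol=1e-8)

# Assumed form of an orthogonal-axes regularizer: penalize the Frobenius
# inner product between generators so that successive unlearning requests
# rotate in near-perpendicular directions and interfere minimally.
def axis_orthogonality_penalty(A, B):
    return np.abs(np.sum(A * B))

penalty = axis_orthogonality_penalty(A1, A2)  # driven toward 0 in training
```

Because the exponential of a skew-symmetric matrix is always orthogonal, varying `theta` continuously interpolates the rotation, which is the property that lets an angle-like quantity serve as a controllable unlearning degree.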