Large language model (LLM) unlearning aims to remove targeted knowledge from a trained model, but practical deployments often require post-training quantization (PTQ) for efficient inference. Aggressive low-bit PTQ can mask unlearning updates, causing the quantized model to revert to its pre-unlearning behavior. We show that standard full-parameter fine-tuning often induces parameter changes that are too small to survive 4-bit quantization. We propose quantization-robust unlearning via low-rank adaptation (LoRA): we freeze the base model and concentrate unlearning into trainable adapters so that the effective update is preserved after quantization. On Llama-2-7B evaluated on the MUSE dataset (BOOKS and NEWS), LoRA improves 4-bit utility by up to 7.93 points (NPO+GDR on BOOKS: 50.17 to 58.10) and yields higher 4-bit utility on NEWS for GA+GDR (40.06 to 44.82, an increase of 4.76). LoRA also substantially reduces privacy leakage under 4-bit PTQ: for GA+KLR on BOOKS, PrivLeak moves from -25.68 to -5.86 (closer to the ideal value of 0) while forgetting remains strong (VerMem and KnowMem near 0). LoRA-based unlearning is therefore well suited to deployments in which the model must be quantized.
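The masking effect described above can be illustrated with a toy experiment. The sketch below is not the paper's pipeline: it assumes a simple symmetric per-tensor round-to-nearest 4-bit quantizer (the hypothetical helper `quantize_rtn`), a random weight matrix, and illustrative update magnitudes chosen only to contrast a "small" and a "large" update. It measures what fraction of 4-bit codes actually change after each update.

```python
import numpy as np


def quantize_rtn(w, bits=4, scale=None):
    """Symmetric per-tensor round-to-nearest quantization (illustrative only).

    Returns the integer codes and the scale that was used.
    """
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit
    if scale is None:
        scale = np.abs(w).max() / qmax
    codes = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return codes, scale


def fraction_of_codes_changed(w_before, w_after, bits=4):
    """Fraction of weights whose quantized code differs after an update.

    Both tensors are quantized on the grid of the original weights so that
    only the update itself, not a change of scale, can flip a code.
    """
    codes_before, scale = quantize_rtn(w_before, bits=bits)
    codes_after, _ = quantize_rtn(w_after, bits=bits, scale=scale)
    return float(np.mean(codes_before != codes_after))


rng = np.random.default_rng(0)

# Toy "pretrained" weight matrix; the 0.02 standard deviation is an assumption.
w = rng.normal(0.0, 0.02, size=(256, 256))

# Assumed update magnitudes, chosen only for contrast (not from the paper).
small_update = rng.normal(0.0, 1e-5, size=w.shape)
large_update = rng.normal(0.0, 1e-2, size=w.shape)

small_frac = fraction_of_codes_changed(w, w + small_update)
large_frac = fraction_of_codes_changed(w, w + large_update)

print(f"codes changed by small update: {small_frac:.4%}")
print(f"codes changed by large update: {large_frac:.4%}")
```

In this toy setting the small update sits far below the quantization step, so almost every weight rounds to the same 4-bit code as before and the quantized model is nearly identical to the original. An update whose magnitude is comparable to the step size changes a large share of the codes and so survives quantization.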