Large Language Model (LLM) unlearning aims to remove targeted knowledge from a trained model, but practical deployments often require post-training quantization (PTQ) for efficient inference. However, aggressive low-bit PTQ can mask or erase unlearning updates, causing quantized models to revert to pre-unlearning behavior. We show that standard full-parameter fine-tuning often induces parameter changes that are too small to survive 4-bit quantization. We propose quantization-robust unlearning via low-rank adaptation (LoRA): we freeze the base model and concentrate unlearning into trainable adapters so that the effective update is preserved after quantization. On Llama-2-7B evaluated on the MUSE benchmark (BOOKS and NEWS), LoRA improves 4-bit utility by up to 7.93 points (NPO+GDR on BOOKS: 50.17 to 58.10) and yields higher 4-bit utility on NEWS for GA+GDR (40.06 to 44.82, a gain of 4.76). LoRA also substantially reduces privacy leakage under 4-bit PTQ: for GA+KLR on BOOKS, PrivLeak moves from -25.68 to -5.86 (closer to the ideal of 0) while maintaining strong forgetting (VerbMem and KnowMem near 0). Thus, LoRA-based machine unlearning is beneficial in scenarios where quantization is necessary for model deployment.
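The failure mode described above can be illustrated with a minimal numeric sketch. The values, the symmetric round-to-nearest quantizer, and the adapter construction below are toy assumptions for illustration, not the paper's actual quantization or unlearning setup: a full-parameter update smaller than the 4-bit quantization step is rounded away, while the same update carried by a full-precision adapter on top of the quantized base survives.

```python
import numpy as np

def quantize_4bit(w, scale):
    """Symmetric 4-bit round-to-nearest quantization with a fixed scale."""
    q = np.clip(np.round(w / scale), -8, 7)  # signed 4-bit codes in [-8, 7]
    return q * scale

# Toy base weights; the per-tensor scale is derived from their max magnitude.
W = np.array([0.30, -0.11, 0.52, -0.44])
scale = np.abs(W).max() / 7

# Full-parameter unlearning: a tiny weight update, much smaller than the
# quantization step (~0.074 here), is erased by round-to-nearest 4-bit PTQ.
delta = np.full_like(W, 1e-3)
erased = np.array_equal(quantize_4bit(W + delta, scale),
                        quantize_4bit(W, scale))
print(erased)  # True: the update does not survive quantization

# LoRA-style deployment: the base weights are frozen and quantized, while the
# adapter update stays in full precision and is applied on top of them,
# so its effect on the deployed model is preserved exactly.
lora_update = delta  # same tiny update, now carried by an adapter
W_deployed = quantize_4bit(W, scale) + lora_update
preserved = not np.array_equal(W_deployed, quantize_4bit(W, scale))
print(preserved)  # True: the adapter's effect survives quantization
```

The key design point is that quantization is applied only to the frozen base weights; the adapter never passes through the quantizer, so even updates far below the quantization step size remain effective at inference time.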