Machine unlearning is gaining attention as a way to remove the effects of adversarial data poisoning attacks from already trained models and to comply with privacy and AI regulations. The objective is to unlearn the effect of undesired data from a trained model while maintaining performance on the remaining data. This paper introduces HyperForget, a novel machine unlearning framework that leverages hypernetworks (neural networks that generate parameters for other networks) to dynamically sample models that lack knowledge of targeted data while preserving essential capabilities. Using diffusion models, we implement two Diffusion HyperForget Networks and use them to sample unlearned models in proof-of-concept experiments. The unlearned models obtained zero accuracy on the forget set while preserving good accuracy on the retain sets, highlighting the potential of HyperForget for dynamic targeted data removal and pointing to a promising direction for developing adaptive machine unlearning algorithms.