Machine learning models trained on vast amounts of real or synthetic data often achieve outstanding predictive performance across various domains. However, this utility comes with increasing concerns about privacy, as the training data may include sensitive information. To address these concerns, machine unlearning has been proposed to erase specific data samples from models. While some unlearning techniques remove data efficiently and at low cost, recent research highlights vulnerabilities where malicious users could request unlearning on manipulated data to compromise the model. Although these attacks are effective, the perturbed data differs from the original training data and therefore fails hash verification. Existing attacks on machine unlearning also suffer from practical limitations and require substantial additional knowledge and resources. To fill the gaps in current unlearning attacks, we introduce the Unlearning Usability Attack. This model-agnostic, unlearning-agnostic, and budget-friendly attack distills data distribution information into a small set of benign data. These data are identified as benign by automatic poisoning detection tools because of their positive impact on model training. Although these data are benign for machine learning, unlearning them significantly degrades the model. Our evaluation demonstrates that unlearning this benign data, comprising no more than 1% of the total training data, can reduce model accuracy by up to 50%. Furthermore, our findings show that well-prepared benign data poses challenges for recent unlearning techniques, as erasing these synthetic instances demands more resources than erasing regular data. These insights underscore the need for future research to reconsider "data poisoning" in the context of machine unlearning.
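To give a concrete feel for what "distilling data distribution information into a small set of data" can mean, the following is a minimal, hypothetical sketch: a toy two-class dataset is summarized by a handful of synthetic points via crude per-class distribution matching, and a classifier trained only on those points still performs well on the full data. The toy data, the distribution-matching scheme, and the logistic model are all illustrative assumptions, not the paper's actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "real" training set: two well-separated Gaussian blobs
# (a hypothetical stand-in for a real dataset).
n = 500
X_real = np.vstack([rng.normal(-1.0, 0.5, (n, 2)),
                    rng.normal(+1.0, 0.5, (n, 2))])
y_real = np.hstack([np.zeros(n), np.ones(n)])

def distill(X, y, k_per_class=5):
    """Crude distillation: sample a few points per class from a
    Gaussian fitted to that class's mean and per-axis std."""
    syn_X, syn_y = [], []
    for c in np.unique(y):
        Xc = X[y == c]
        mu, sd = Xc.mean(axis=0), Xc.std(axis=0)
        pts = mu + sd * rng.normal(0.0, 1.0, (k_per_class, X.shape[1]))
        syn_X.append(pts)
        syn_y.append(np.full(k_per_class, c))
    return np.vstack(syn_X), np.hstack(syn_y)

X_syn, y_syn = distill(X_real, y_real)  # 10 points ~ 1% of the data

def train(X, y, steps=500, lr=0.5):
    """Plain gradient-descent logistic regression."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * (p - y).mean()
    return w, b

# Train only on the distilled points, evaluate on the full real data.
w, b = train(X_syn, y_syn)
acc = (((X_real @ w + b) > 0) == (y_real == 1)).mean()
print(f"accuracy on real data from {len(X_syn)} distilled points: {acc:.2f}")
```

Because the distilled points carry so much of the distribution's information, a model that later unlearns them loses far more than a same-sized slice of ordinary training data would suggest, which is the intuition behind the attack.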