With recent legislation on the right to be forgotten, machine unlearning has emerged as a crucial research area. It enables the removal of a user's data from machine learning models trained with federated learning without retraining from scratch. However, current machine unlearning algorithms face challenges in both efficiency and validity. To address these issues, we propose a new framework named Goldfish, which comprises four modules: basic model, loss function, optimization, and extension. To address the low validity of existing machine unlearning algorithms, we propose a novel loss function that accounts for three factors: the loss arising from the discrepancy between predictions and actual labels on the remaining dataset, the bias of the predicted results on the removed dataset, and the confidence level of the predicted results. To enhance efficiency, we adopt a knowledge distillation technique in the basic model and introduce an optimization module that combines an early termination mechanism guided by empirical risk with a data partition mechanism. Furthermore, to bolster the robustness of the aggregated model, we propose an extension module that incorporates two mechanisms: an adaptive distillation temperature to address the heterogeneity of users' local data, and an adaptive weight to handle variation in the quality of uploaded models. Finally, we conduct comprehensive experiments to demonstrate the effectiveness of the proposed approach.
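As a rough illustration of how a three-part unlearning loss of this kind might be assembled, the sketch below combines a cross-entropy term on the remaining data with two penalties on the removed data. It is not the Goldfish loss itself: the abstract does not give the formula, so the specific definitions of the bias term (probability still assigned to the original label) and the confidence term (normalized peakedness of the prediction), the weights `alpha` and `beta`, and all function names are assumptions made for illustration only.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over one list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true label."""
    return -math.log(max(probs[label], 1e-12))


def confidence(probs):
    """Assumed confidence measure: 1 - H(p) / log(k).

    Close to 0 for a uniform prediction, close to 1 for a one-hot prediction.
    """
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 - entropy / math.log(len(probs))


def unlearning_loss(remaining, removed, alpha=1.0, beta=1.0):
    """Hypothetical three-term loss.

    remaining, removed: lists of (logits, label) pairs.
    alpha, beta: assumed weights for the two removed-data penalties.
    """
    # Term 1: fit the data that should still be remembered.
    remain_term = sum(
        cross_entropy(softmax(logits), label) for logits, label in remaining
    ) / len(remaining)

    # Term 2 (assumed "bias"): probability still given to the original
    # label of each removed sample.
    bias_term = sum(
        softmax(logits)[label] for logits, label in removed
    ) / len(removed)

    # Term 3 (assumed "confidence"): how peaked predictions on removed
    # samples remain.
    conf_term = sum(
        confidence(softmax(logits)) for logits, _ in removed
    ) / len(removed)

    return remain_term + alpha * bias_term + beta * conf_term
```

Under these assumptions, a model that still predicts a removed sample's original label with high confidence incurs a larger loss than one whose prediction on that sample is uniform, which is the behaviour the abstract's description suggests the loss should reward.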