Task-Aware Machine Unlearning and Its Application in Load Forecasting

Data privacy and security have become a non-negligible factor in load forecasting. Previous research has mainly focused on enhancing the training stage. However, once a model is trained and deployed, it may need to "forget" (i.e., remove the impact of) part of its training data, either because the data is found to be malicious or because the data owner requests its removal. This paper introduces a machine unlearning algorithm specifically designed to remove the influence of part of the original dataset on an already trained forecaster. However, direct unlearning inevitably degrades the model's generalization ability. To balance unlearning completeness against performance degradation, a performance-aware algorithm is proposed that evaluates the sensitivity of local model parameter changes using influence functions and sample re-weighting. Moreover, we observe that statistical criteria cannot fully reflect the operational cost of downstream tasks. Therefore, a task-aware machine unlearning algorithm is proposed whose objective is a tri-level optimization that accounts for the dispatch and redispatch problems. We theoretically prove the existence of the gradient of this objective, which is key to re-weighting the remaining samples. We test the unlearning algorithms on linear and neural network load forecasters with a realistic load dataset. The simulations demonstrate the balance between unlearning completeness and operational cost. All code can be found at https://github.com/xuwkk/task_aware_machine_unlearning.
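To make the idea of influence-function unlearning concrete, the following is a minimal sketch for a linear (ridge) forecaster. It is not the code from the repository above: the function names `fit_ridge` and `unlearn_influence`, the regularization weight `lam`, and the synthetic data are all assumptions made for illustration. For a quadratic loss, a single Newton step on the remaining-data objective reproduces retraining exactly; for a neural network forecaster the same step would only be an approximation. The performance-aware and task-aware re-weighting of remaining samples described in the abstract is not shown.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Minimize 0.5*||X @ theta - y||^2 + 0.5*lam*||theta||^2 in closed form."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def unlearn_influence(theta, X, y, forget_idx, lam):
    """Remove the samples in forget_idx from a trained ridge model
    with one influence-function (Newton) step, without retraining."""
    keep = np.ones(len(y), dtype=bool)
    keep[forget_idx] = False
    X_r = X[keep]                      # remaining samples
    X_f, y_f = X[~keep], y[~keep]      # samples to forget
    # Hessian of the objective restricted to the remaining samples.
    H_r = X_r.T @ X_r + lam * np.eye(X.shape[1])
    # Gradient of the forgotten samples' loss at the current parameters.
    g_f = X_f.T @ (X_f @ theta - y_f)
    # At the trained optimum the remaining-data gradient equals -g_f,
    # so the Newton step is theta - H_r^{-1}(-g_f).
    return theta + np.linalg.solve(H_r, g_f)

if __name__ == "__main__":
    # Synthetic stand-in for a load dataset (hypothetical data).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)
    lam = 1.0
    theta = fit_ridge(X, y, lam)
    forget_idx = np.arange(20)
    theta_unlearned = unlearn_influence(theta, X, y, forget_idx, lam)
    theta_retrained = fit_ridge(X[20:], y[20:], lam)
    print(np.max(np.abs(theta_unlearned - theta_retrained)))
```

Comparing `theta_unlearned` with `theta_retrained` is one simple way to check unlearning completeness in this linear setting, since retraining on the remaining data is the reference that unlearning tries to match.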

