Task-Aware Machine Unlearning and Its Application in Load Forecasting

Data privacy and security have become non-negligible factors in load forecasting. Previous research has mainly focused on enhancing the training stage. However, once a model is trained and deployed, it may need to 'forget' (i.e., remove the impact of) part of its training data if those data are found to be malicious or if the data owner requests their removal. This paper introduces machine unlearning, which is specifically designed to remove the influence of part of the dataset on an already trained forecaster. However, direct unlearning inevitably degrades the model's generalization ability. To balance unlearning completeness against model performance, a performance-aware algorithm is proposed that evaluates the sensitivity of local model parameter changes using influence functions and sample re-weighting. Furthermore, we observe that statistical criteria such as mean squared error cannot fully reflect the operation cost of downstream tasks in power systems. Therefore, a task-aware machine unlearning algorithm is proposed whose objective is a trilevel optimization that accounts for the dispatch and redispatch problems. We theoretically prove the existence of the gradient of this objective, which is key to re-weighting the remaining samples. We test the unlearning algorithms on linear, CNN, and MLP-Mixer based load forecasters with a realistic load dataset. The simulations demonstrate the balance between unlearning completeness and operational cost. All code can be found at https://github.com/xuwkk/task_aware_machine_unlearning.
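To make the influence-function idea mentioned in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a ridge-regularized linear forecaster trained on synthetic data, and the names `fit_ridge` and `influence_unlearn` are hypothetical. It shows only the plain first-order influence update for removing a forget set; the performance-aware and task-aware re-weighting of the remaining samples is not reproduced here.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Closed-form minimiser of 0.5*||X @ theta - y||^2 + 0.5*lam*||theta||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def influence_unlearn(theta, X, X_forget, y_forget, lam):
    """First-order influence estimate of the parameters after removing the forget set.

    Assumption: theta is the exact minimiser on the full training set (X, y).
    """
    d = X.shape[1]
    # Hessian of the training objective evaluated on the full dataset.
    hessian = X.T @ X + lam * np.eye(d)
    # Sum of per-sample loss gradients over the samples to be forgotten.
    grad_forget = X_forget.T @ (X_forget @ theta - y_forget)
    # Removing samples corresponds to down-weighting them, hence the plus sign.
    return theta + np.linalg.solve(hessian, grad_forget)

# Synthetic stand-in for a load dataset (an assumption for illustration only).
rng = np.random.default_rng(0)
n, d, lam = 500, 8, 1.0
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)
forget = np.arange(25)      # indices of samples to unlearn
keep = np.arange(25, n)     # indices of remaining samples

theta_full = fit_ridge(X, y, lam)
theta_retrain = fit_ridge(X[keep], y[keep], lam)   # reference: retrain from scratch
theta_unlearn = influence_unlearn(theta_full, X, X[forget], y[forget], lam)

# Distance to the retrained model before and after the influence update.
err_before = np.linalg.norm(theta_full - theta_retrain)
err_after = np.linalg.norm(theta_unlearn - theta_retrain)
print(f"before unlearning: {err_before:.3e}, after: {err_after:.3e}")
```

In this quadratic setting the update moves the parameters closer to the retrained model without refitting. Replacing the full-data Hessian with the Hessian of the remaining samples would make the step exact for a linear model; for CNN or MLP-Mixer forecasters the update is only a local approximation.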
