A Novel Bifurcation Method for Observation Perturbation Attacks on Reinforcement Learning Agents: Load Altering Attacks on a Cyber Physical Power System

July 6, 2024

Kiernan Broda-Milian, Ranwa Al-Mallah, Hanane Dagdougui

From arXiv, 12 pages, 5 figures

Components of cyber physical systems, which affect real-world processes, are often exposed to the internet. Replacing conventional control methods with Deep Reinforcement Learning (DRL) in energy systems is an active area of research, as these systems become increasingly complex with the advent of renewable energy sources and the desire to improve their efficiency. Artificial Neural Networks (ANNs) are vulnerable to specific perturbations of their inputs or features, called adversarial examples. When properly regularized, these perturbations are difficult to detect but have significant effects on the ANN's output. Because DRL agents use ANNs to map observations to actions, they are similarly vulnerable to adversarial examples. This work proposes a novel attack technique for continuous control using Group Difference Logits loss with a bifurcation layer. By combining aspects of targeted and untargeted attacks, the attack has a significantly larger impact than an untargeted attack, with drastically smaller distortions than an optimally targeted attack. We demonstrate the impacts of powerful gradient-based attacks in a realistic smart energy environment, show how the impacts change with different DRL agents and training procedures, and use statistical and time-series analysis to evaluate the attacks' stealth. The results show that adversarial attacks can have significant impacts on DRL controllers, and that constraining an attack's perturbations makes it difficult to detect. However, certain DRL architectures are far more robust, and robust training methods can further reduce the impact.

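As a rough illustration of the kind of bounded, gradient-based observation perturbation the abstract refers to, the Python sketch below applies a single gradient-sign step to the observation of a toy policy. It is not the paper's bifurcation method or its Group Difference Logits loss, whose details the abstract does not give. The linear-tanh policy, the function names, and all numbers are assumptions made for illustration only.

```python
import math


def policy(obs, weights, bias):
    """Toy deterministic policy (an assumption): one continuous action in (-1, 1)."""
    z = sum(w * o for w, o in zip(weights, obs)) + bias
    return math.tanh(z)


def action_gradient(obs, weights, bias):
    """Analytic gradient of the action with respect to each observation entry."""
    a = policy(obs, weights, bias)
    # d tanh(z) / d obs_i = (1 - tanh(z)^2) * w_i
    return [(1.0 - a * a) * w for w in weights]


def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def perturb_observation(obs, weights, bias, epsilon, direction=1):
    """Take one gradient-sign step bounded by epsilon in the L-infinity norm.

    direction=+1 pushes the action up; direction=-1 pushes it down.
    """
    grad = action_gradient(obs, weights, bias)
    return [o + direction * epsilon * sign(g) for o, g in zip(obs, grad)]


# Hypothetical values, chosen only to make the sketch runnable.
weights = [0.8, -0.5, 0.3]
bias = 0.1
obs = [0.2, 0.4, -0.1]
epsilon = 0.05

adv_obs = perturb_observation(obs, weights, bias, epsilon)
clean_action = policy(obs, weights, bias)
adv_action = policy(adv_obs, weights, bias)

# Largest change applied to any single observation entry.
max_change = max(abs(a - o) for a, o in zip(adv_obs, obs))
```

In this toy setting the perturbed observation stays within `epsilon` of the clean one in every entry, yet the action moves in the direction the attacker chose. That is the trade-off between distortion size and impact that the abstract describes, reduced to its simplest form.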