Reinforcement learning often uses neural networks to solve complex control tasks. However, neural networks are sensitive to input perturbations, which makes deploying them in safety-critical environments challenging. This work lifts recent results on formally verifying neural networks against such perturbations to reinforcement learning in continuous state and action spaces, using reachability analysis. While previous work mainly focuses on adversarial attacks for robust reinforcement learning, we train neural networks on entire sets of perturbed inputs and maximize the worst-case reward. The obtained agents are verifiably more robust than agents obtained by related work, making them better suited to safety-critical environments. This is demonstrated in an extensive empirical evaluation on four different benchmarks.
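The set-based idea can be illustrated with a minimal sketch: assuming a box (interval) perturbation set around the observed state and a small ReLU policy network, interval arithmetic propagates the entire input set through the network to bound all reachable actions. The paper's reachability analysis works with richer set representations and full training; everything below (network sizes, the `action_set_bounds` helper, the `eps` parameter) is hypothetical and only illustrates the principle.

```python
# Minimal sketch: propagating an entire perturbed input set through a small
# ReLU policy network with interval arithmetic. Hypothetical illustration,
# not the authors' implementation.
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer policy: action = W2 @ relu(W1 @ s + b1) + b2
W1, b1 = rng.standard_normal((8, 4)), rng.standard_normal(8)
W2, b2 = rng.standard_normal((2, 8)), rng.standard_normal(2)

def interval_affine(lo, hi, W, b):
    """Propagate an interval [lo, hi] exactly through x -> W @ x + b."""
    Wp, Wn = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return Wp @ lo + Wn @ hi + b, Wp @ hi + Wn @ lo + b

def interval_relu(lo, hi):
    """ReLU is monotone, so it maps interval bounds elementwise."""
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

def action_set_bounds(state, eps):
    """Bound all actions reachable from the perturbed state set
    {s' : |s' - state|_inf <= eps}."""
    lo, hi = state - eps, state + eps
    lo, hi = interval_relu(*interval_affine(lo, hi, W1, b1))
    return interval_affine(lo, hi, W2, b2)

state = np.array([0.5, -0.2, 0.1, 0.0])
lo_a, hi_a = action_set_bounds(state, eps=0.05)
print("action lower bounds:", lo_a)
print("action upper bounds:", hi_a)
```

In a training loop following the abstract's recipe, such output bounds would feed into a worst-case reward estimate, and the policy parameters would be updated to maximize that lower bound rather than the nominal reward.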