Offline reinforcement learning (RL) addresses the cost and risk of data exploration inherent in RL by pre-training policies on large amounts of offline data, enabling direct deployment or fine-tuning in real-world environments. However, this training paradigm can compromise policy robustness, leading to degraded performance under practical conditions such as observation perturbations or intentional attacks. While adversarial attacks and defenses have been extensively studied in deep learning, their application to offline RL remains limited. This paper proposes a framework that enhances the robustness of offline RL models by leveraging advanced adversarial attacks and defenses. The framework attacks the actor and critic components by perturbing observations during training and uses adversarial defenses as regularization to strengthen the learned policy. Four attacks and two defenses are introduced and evaluated on the D4RL benchmark. The results show that both the actor and the critic are vulnerable to attacks and that the defenses are effective in improving policy robustness. This framework holds promise for improving the reliability of offline RL models in practical scenarios.
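As an illustration of the general idea of perturbing observations and penalizing the resulting change in the policy output, the following is a minimal sketch. It is not the paper's method: the abstract does not name its four attacks or two defenses, so the linear `tanh` actor, the function names, the L-infinity budget `eps`, and the squared action-difference penalty are all assumptions chosen for illustration.

```python
import numpy as np


def actor(W, obs):
    """Hypothetical deterministic actor: action = tanh(W @ obs)."""
    return np.tanh(W @ obs)


def random_attack(obs, eps, rng):
    """Baseline attack (assumed): uniform noise inside an L-infinity ball of radius eps."""
    return obs + rng.uniform(-eps, eps, size=obs.shape)


def action_diff_attack(W, obs, eps, rng, steps=5):
    """Assumed actor attack: projected sign-gradient ascent on
    ||actor(obs + delta) - actor(obs)||^2, keeping |delta| <= eps."""
    clean = actor(W, obs)
    # Start from a random point, since the gradient is zero at delta = 0.
    delta = rng.uniform(-eps, eps, size=obs.shape)
    step = eps / steps
    for _ in range(steps):
        adv = actor(W, obs + delta)
        # Chain rule through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
        grad = W.T @ (2.0 * (adv - clean) * (1.0 - adv ** 2))
        # Ascent step followed by projection back onto the eps-ball.
        delta = np.clip(delta + step * np.sign(grad), -eps, eps)
    return obs + delta


def smoothness_penalty(W, obs, adv_obs):
    """Assumed defense term: squared change in action under the perturbation.
    It would be added, with some weight, to the usual actor loss."""
    return float(np.sum((actor(W, adv_obs) - actor(W, obs)) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(2, 3))   # toy policy weights
    obs = rng.normal(size=3)      # one toy observation
    adv_obs = action_diff_attack(W, obs, eps=0.1, rng=rng)
    print("penalty under attack:", smoothness_penalty(W, obs, adv_obs))
```

In a full training loop, a term like `smoothness_penalty` would be computed on each mini-batch of offline observations and added to the policy objective. An analogous perturbation could be applied to the critic's input, but how the paper does either is not specified in the abstract.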