Goal-Conditioned Reinforcement Learning (GCRL) has gained attention, yet its algorithmic robustness against adversarial perturbations remains unexplored. Attacks and robust representation training methods designed for traditional RL become less effective when applied to GCRL. To address this challenge, we first propose the Semi-Contrastive Representation attack, a novel approach inspired by the adversarial contrastive attack. Unlike existing attacks in RL, it requires only information from the policy function and can be applied directly during deployment. Then, to mitigate the vulnerability of existing GCRL algorithms, we introduce Adversarial Representation Tactics, which combines Semi-Contrastive Adversarial Augmentation with a Sensitivity-Aware Regularizer to improve the adversarial robustness of the underlying RL agent against various types of perturbations. Extensive experiments validate the superior performance of our attack and defence methods across multiple state-of-the-art GCRL algorithms. Our tool ReRoGCRL is available at https://github.com/TrustAI/ReRoGCRL.