Enabling safe exploration by reinforcement learning (RL) agents during training remains a critical challenge for deploying them in many real-world scenarios. Training RL agents in unknown, black-box environments poses an even greater safety risk when prior knowledge of the domain or task is unavailable. We introduce ADVICE (Adaptive Shielding with a Contrastive Autoencoder), a novel post-shielding technique that distinguishes safe from unsafe features of state-action pairs during training, thereby protecting the RL agent from executing actions that yield potentially hazardous outcomes. Our comprehensive experimental evaluation against state-of-the-art safe RL exploration techniques demonstrates that ADVICE can significantly reduce safety violations during training while maintaining a competitive outcome reward.
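To make the notion of post-shielding concrete, the sketch below shows a generic shield that sits between the agent and the environment: it scores each proposed state-action pair and, if the pair looks unsafe, substitutes a lower-risk candidate. This is an illustration under stated assumptions, not the ADVICE implementation. The class `PostShield`, the function `toy_risk`, the threshold, and the random candidate search are all hypothetical, and the learned contrastive autoencoder is replaced here by a hand-written risk function.

```python
import numpy as np


class PostShield:
    """Generic post-shield sketch: vets a proposed action before execution."""

    def __init__(self, risk_fn, threshold=0.5, n_candidates=16, seed=0):
        self.risk_fn = risk_fn            # maps (state, action) -> risk score
        self.threshold = threshold        # pairs scoring above this are blocked
        self.n_candidates = n_candidates  # alternatives sampled on a block
        self.rng = np.random.default_rng(seed)

    def filter(self, state, action, low=-1.0, high=1.0):
        """Return (action_to_execute, was_overridden)."""
        if self.risk_fn(state, action) <= self.threshold:
            return action, False
        # Proposed action looks unsafe: sample alternatives, keep the least risky.
        candidates = self.rng.uniform(
            low, high, size=(self.n_candidates, action.shape[0])
        )
        risks = np.array([self.risk_fn(state, c) for c in candidates])
        return candidates[int(np.argmin(risks))], True


def toy_risk(state, action):
    """Hypothetical stand-in for a learned safety classifier.

    The state is a 1-D position and the action a 1-D displacement;
    moving past position 1.0 counts as unsafe.
    """
    return 1.0 if state[0] + action[0] > 1.0 else 0.0


shield = PostShield(toy_risk)

# A safe proposal passes through unchanged.
safe_state, safe_action = np.array([0.0]), np.array([0.5])
passed_action, passed_overridden = shield.filter(safe_state, safe_action)

# An unsafe proposal is intercepted and replaced by a sampled alternative.
risky_state, risky_action = np.array([0.9]), np.array([0.5])
shielded_action, shielded_overridden = shield.filter(risky_state, risky_action)
```

In a training loop, `filter` would be called on every action the policy proposes, just before the environment step. A learned shield would take the place of `toy_risk` with a model trained on state-action pairs observed during training.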