Reinforcement learning agents deployed in the real world often have to cope with partially observable environments. Therefore, most agents employ memory mechanisms to approximate the state of the environment. Recently, there have been impressive success stories in mastering partially observable environments, mostly in the realm of computer games like Dota 2, StarCraft II, or Minecraft. However, none of these methods is interpretable, in the sense that humans cannot comprehend how the agent decides which actions to take based on its inputs. Yet human understanding is necessary to deploy such methods in high-stakes domains like autonomous driving or medical applications. We propose a novel memory mechanism that operates on human language to illuminate the decision-making process. First, we use CLIP to associate visual inputs with language tokens. Then we feed these tokens to a pretrained language model that serves as the agent's memory and provides it with a coherent and interpretable representation of the past. Our memory mechanism achieves state-of-the-art performance in environments where memorizing the past is crucial to solving tasks. Further, we present situations in which our memory component excels or fails, to demonstrate the strengths and weaknesses of our new approach.
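The two-step pipeline described in the abstract (map each visual input to language tokens, then keep those tokens as a readable memory) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the abstract does not say how CLIP associates images with tokens, so cosine-similarity retrieval is used here as one plausible choice; the 3-dimensional vectors stand in for real CLIP embeddings; and the names `retrieve_tokens`, `update_memory`, and `TOKEN_EMBEDDINGS` are hypothetical. The pretrained language model is not included; the token history below is only what would be fed to it.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_tokens(image_embedding, token_embeddings, k=1):
    """Return the k tokens whose embeddings are most similar to the image embedding."""
    ranked = sorted(
        token_embeddings,
        key=lambda tok: cosine(image_embedding, token_embeddings[tok]),
        reverse=True,
    )
    return ranked[:k]


# Toy stand-ins for token embeddings in a shared image-text space (assumption).
TOKEN_EMBEDDINGS = {
    "key": [1.0, 0.0, 0.0],
    "door": [0.0, 1.0, 0.0],
    "wall": [0.0, 0.0, 1.0],
}


def update_memory(memory, image_embedding, k=1):
    """Append the retrieved tokens for one observation to the token history."""
    memory.extend(retrieve_tokens(image_embedding, TOKEN_EMBEDDINGS, k))
    return memory


memory = []
update_memory(memory, [0.9, 0.1, 0.0])  # observation closest to "key"
update_memory(memory, [0.1, 0.8, 0.2])  # observation closest to "door"
print(" ".join(memory))  # prints "key door"
```

Because the memory is a sequence of words rather than a hidden vector, a human can read the trace directly, which is the interpretability property the abstract emphasizes.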