Explainability plays an increasingly important role in machine learning. Because humans view the world through a causal lens, they tend to prefer causal explanations over associational ones. In this paper, we therefore develop a causal explanation mechanism that quantifies the causal importance of states on actions and how that importance changes over time. Through a series of simulation studies, including crop irrigation, Blackjack, collision avoidance, and lunar lander, we demonstrate the advantages of our mechanism over state-of-the-art associational methods for explaining reinforcement learning (RL) policies.