Reinforcement Learning (RL) systems can be complex and non-interpretable, making it challenging for non-AI experts to understand or intervene in their decisions. This is due, in part, to the sequential nature of RL, in which actions are chosen because of future rewards. However, RL agents discard the qualitative features of their training, making it hard to recover user-understandable information about "why" an action is chosen. We propose a technique, Experiential Explanations, to generate counterfactual explanations by training influence predictors alongside the RL policy. Influence predictors are models that learn how sources of reward affect the agent in different states, thus restoring information about how the policy reflects the environment. A human evaluation study revealed that participants presented with experiential explanations were better able to correctly guess what an agent would do than those presented with other standard types of explanations. Participants also found experiential explanations to be more understandable, satisfying, complete, useful, and accurate. The qualitative analysis provides insights into the factors of experiential explanations that participants find most useful.
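To make the idea of an influence predictor concrete, the following is a minimal tabular sketch, not the authors' implementation. It assumes a hypothetical five-state corridor with two reward sources (a hazard and a goal), a Q-learning policy, and one influence table per source that is trained on the same transitions using only the absolute value of that source's reward. The environment, every name, and the hyperparameters are illustrative assumptions.

```python
import random

# Hypothetical 1-D corridor with states 0..4 (an assumption for illustration).
# State 0 is a hazard (reward -1), state 4 is the goal (reward +1); both are terminal.
N_STATES, START = 5, 2
GAMMA, ALPHA, EPSILON = 0.9, 0.5, 0.2
ACTIONS = (-1, +1)            # index 0: step left, index 1: step right
SOURCES = ("hazard", "goal")  # the two sources of reward


def step(state, action):
    """Return (next_state, rewards_by_source, done)."""
    nxt = state + action
    rewards = {
        "hazard": -1.0 if nxt == 0 else 0.0,
        "goal": 1.0 if nxt == N_STATES - 1 else 0.0,
    }
    return nxt, rewards, nxt in (0, N_STATES - 1)


def greedy(table, state):
    """Index of the highest-valued action (ties go to index 0)."""
    return max(range(len(ACTIONS)), key=lambda i: table[state][i])


def train(episodes=2000, seed=0):
    rng = random.Random(seed)
    # Policy Q-table, trained on the summed reward.
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    # One influence table per reward source, trained on that source alone.
    infl = {k: [[0.0, 0.0] for _ in range(N_STATES)] for k in SOURCES}
    for _ in range(episodes):
        state, done = START, False
        while not done:
            explore = rng.random() < EPSILON
            a = rng.randrange(len(ACTIONS)) if explore else greedy(q, state)
            nxt, rewards, done = step(state, ACTIONS[a])
            a_next = greedy(q, nxt)  # the action the policy would prefer next
            # Standard Q-learning update for the policy.
            boot = 0.0 if done else GAMMA * q[nxt][a_next]
            q[state][a] += ALPHA * (sum(rewards.values()) + boot - q[state][a])
            # Influence update: same transition, one source, absolute reward,
            # bootstrapping along the policy's preferred next action.
            for k in SOURCES:
                boot_k = 0.0 if done else GAMMA * infl[k][nxt][a_next]
                infl[k][state][a] += ALPHA * (abs(rewards[k]) + boot_k - infl[k][state][a])
            state = nxt
    return q, infl


def explain(infl, state, chosen, alternative):
    """Per-source influence of the chosen action versus a counterfactual one."""
    return {k: (infl[k][state][chosen], infl[k][state][alternative]) for k in SOURCES}


q, infl = train()
# Why step right rather than left in state 1?
print(explain(infl, state=1, chosen=1, alternative=0))
```

In this toy setting, the hazard influence of stepping left from state 1 should come out much larger than that of stepping right, while the goal influence favors stepping right. That contrast is the kind of per-source information a counterfactual explanation could be built from. The policy's own Q-values collapse it into a single number.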