Explaining the behaviour of intelligent agents trained with reinforcement learning (RL) to humans is crucial yet challenging because of their incomprehensible proprioceptive states, varying intermediate goals, and resulting unpredictability. Moreover, one-step explanations for RL agents can be ambiguous because they fail to account for the agent's future behaviour at each transition, which further complicates the explanation of robot actions. By using abstracted actions that map to task-specific primitives, we avoid explanations at the movement level. To further improve the transparency and explainability of robotic systems, we propose an explainable Q-Map learning framework that combines reward decomposition (RD) with abstracted action spaces, allowing for unambiguous, high-level explanations based on object properties in the task. We demonstrate the effectiveness of our framework through quantitative and qualitative analysis of two robotic scenarios, showcasing visual and textual explanations, derived from the output artefacts of RD, that are easy for humans to comprehend. Additionally, we demonstrate the versatility of integrating these artefacts with large language models (LLMs) for reasoning and interactive querying.
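As a rough illustration of how reward decomposition over abstracted actions can yield object-level explanations, the sketch below keeps one Q-value per reward component, sums the components to select an action, and reports which components favour the chosen action over an alternative. It is a minimal sketch under stated assumptions, not the framework's implementation: the component names, action names, and all numeric values are hypothetical.

```python
# Hypothetical reward components tied to object properties (illustration only).
COMPONENTS = ("colour_match", "shape_match", "collision_penalty")

# Hypothetical per-component Q-values for two abstracted actions.
# The numbers are made up for the example, not taken from any experiment.
Q_COMPONENTS = {
    "push_red_cube": {
        "colour_match": 0.9, "shape_match": 0.5, "collision_penalty": -0.4},
    "push_blue_ball": {
        "colour_match": 0.2, "shape_match": 0.1, "collision_penalty": -0.1},
}


def total_q(action):
    """Overall Q-value: the sum of the component Q-values."""
    return sum(Q_COMPONENTS[action][c] for c in COMPONENTS)


def greedy_action():
    """Abstracted action with the highest summed Q-value."""
    return max(Q_COMPONENTS, key=total_q)


def component_differences(chosen, alternative):
    """Per-component Q-value advantage of `chosen` over `alternative`."""
    return {
        c: Q_COMPONENTS[chosen][c] - Q_COMPONENTS[alternative][c]
        for c in COMPONENTS
    }


def textual_explanation(chosen, alternative):
    """Turn the component differences into a short sentence."""
    diffs = component_differences(chosen, alternative)
    pros = [c for c, d in diffs.items() if d > 0]
    cons = [c for c, d in diffs.items() if d < 0]
    return (
        f"Chose {chosen} over {alternative}: "
        f"better on {', '.join(pros) or 'nothing'}; "
        f"worse on {', '.join(cons) or 'nothing'}."
    )


if __name__ == "__main__":
    best = greedy_action()
    other = next(a for a in Q_COMPONENTS if a != best)
    print(textual_explanation(best, other))
```

Because the explanation is computed per abstracted action rather than per motor command, the resulting sentence refers to task-level properties instead of low-level movements; the same dictionary of differences could also be passed to a language model as structured context.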