We present CEMA (Causal Explanations in Multi-Agent systems), a framework for generating causal natural language explanations of an agent's decisions in dynamic sequential multi-agent systems, with the goal of building more trustworthy autonomous agents. Unlike prior work that assumes a fixed causal structure, CEMA requires only a probabilistic model for forward-simulating the state of the system. Using such a model, CEMA simulates counterfactual worlds to identify the salient causes behind the agent's decisions. We evaluate CEMA on the task of motion planning for autonomous driving and test it in diverse simulated scenarios. We show that CEMA correctly and robustly identifies the causes behind the agent's decisions, even when many other agents are present, and a user study shows that CEMA's explanations positively affect participants' trust in autonomous vehicles and are rated as highly as high-quality baseline explanations elicited from other participants. We release the collected explanations, together with their annotations, as the HEADD dataset.
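To make the counterfactual idea concrete, the following is a minimal sketch under assumptions of our own: the names `forward_simulate`, `decision_fn`, `sample_counterfactual`, and the feature keys are hypothetical placeholders, not CEMA's actual interface. It scores each candidate cause by how often resampling that feature flips the agent's decision across forward-simulated counterfactual worlds.

```python
import numpy as np

def sample_counterfactual(state, feature, rng, scale=1.0):
    """Resample one candidate cause, leaving the rest of the state fixed."""
    cf = dict(state)
    cf[feature] = rng.normal(state[feature], scale)
    return cf

def cause_saliency(state, decision_fn, forward_simulate, features,
                   n=500, scale=1.0, seed=0):
    """Score each candidate cause by how often perturbing it flips the
    agent's decision in forward-simulated counterfactual worlds."""
    rng = np.random.default_rng(seed)
    factual = decision_fn(forward_simulate(state, rng))
    scores = {}
    for feature in features:
        flips = sum(
            decision_fn(forward_simulate(
                sample_counterfactual(state, feature, rng, scale), rng)) != factual
            for _ in range(n)
        )
        scores[feature] = flips / n
    return scores

if __name__ == "__main__":
    # Toy demo (hypothetical): the ego agent brakes iff the gap to the
    # lead vehicle is small; the forward model is a stand-in identity map.
    identity_sim = lambda state, rng: state
    decide = lambda state: state["lead_gap"] < 10.0
    print(cause_saliency({"lead_gap": 5.0, "crosswind": 0.0},
                         decide, identity_sim,
                         ["lead_gap", "crosswind"], scale=5.0))
    # Expect lead_gap to score far higher than crosswind.
```

In this toy setup the gap to the lead vehicle emerges as the salient cause of braking, while the irrelevant crosswind feature scores near zero; CEMA's actual procedure operates on richer probabilistic forward models of multi-agent driving scenes.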