Explanation generation frameworks aim to make AI systems' decisions transparent and understandable to human users. However, generating explanations in uncertain environments, characterized by incomplete information and probabilistic models, remains a significant challenge. In this paper, we propose a novel framework for generating probabilistic monolithic explanations and model reconciling explanations. Monolithic explanations provide self-contained reasons for an explanandum without considering the agent receiving the explanation, while model reconciling explanations account for the knowledge of the agent receiving the explanation. For monolithic explanations, our approach handles uncertainty by using probabilistic logic to identify explanations that increase the probability of the explanandum. For model reconciling explanations, we propose a framework that extends the logic-based variant of the model reconciliation problem to account for probabilistic human models, where the goal is to find explanations that increase the probability of the explanandum while minimizing conflicts between the explanation and the probabilistic human model. We introduce explanatory gain and explanatory power as quantitative metrics to assess the quality of these explanations. Furthermore, we present algorithms that exploit the duality between minimal correction sets and minimal unsatisfiable sets to efficiently compute both types of explanations in probabilistic contexts. Extensive experimental evaluations on various benchmarks demonstrate the effectiveness and scalability of our approach in generating explanations under uncertainty.
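The duality mentioned above states that the minimal correction sets (MCSes) of an unsatisfiable constraint set are exactly the minimal hitting sets of its minimal unsatisfiable sets (MUSes), and vice versa. The following is a minimal brute-force sketch of that relationship on a hypothetical toy constraint set (the constraints `c0`–`c3` are illustrative assumptions, not taken from the paper, and the enumeration is exponential, unlike the paper's algorithms):

```python
from itertools import product, combinations

# Hypothetical soft constraints over Boolean variables x, y (illustrative only):
# c0: x, c1: not x, c2: y, c3: not y. Jointly unsatisfiable.
CONSTRAINTS = {
    "c0": lambda a: a["x"],
    "c1": lambda a: not a["x"],
    "c2": lambda a: a["y"],
    "c3": lambda a: not a["y"],
}

def satisfiable(names):
    """True if some assignment to x, y satisfies every constraint in `names`."""
    for x, y in product([False, True], repeat=2):
        a = {"x": x, "y": y}
        if all(CONSTRAINTS[n](a) for n in names):
            return True
    return False

def minimal_subsets(pred):
    """All subset-minimal subsets of CONSTRAINTS satisfying `pred` (brute force)."""
    found = []
    for k in range(len(CONSTRAINTS) + 1):  # smallest subsets first
        for s in combinations(sorted(CONSTRAINTS), k):
            s = frozenset(s)
            if pred(s) and not any(m <= s for m in found):
                found.append(s)
    return found

# MUS: a subset-minimal unsatisfiable subset of the constraints.
muses = minimal_subsets(lambda s: not satisfiable(s))
# MCS: a subset-minimal set whose removal restores satisfiability.
all_c = frozenset(CONSTRAINTS)
mcses = minimal_subsets(lambda s: satisfiable(all_c - s))

# Hitting-set duality: every MCS intersects every MUS.
assert all(mcs & mus for mcs in mcses for mus in muses)
print(sorted(sorted(m) for m in muses))  # → [['c0', 'c1'], ['c2', 'c3']]
print(len(mcses))                        # → 4
```

Here the two MUSes are the contradictory pairs {c0, c1} and {c2, c3}, and the four MCSes are exactly the minimal sets hitting both pairs, illustrating why algorithms can enumerate one family by computing hitting sets of the other.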