Machine learning systems are widely used as auxiliary tools in domains that require critical decision-making, such as healthcare and criminal justice. For users to develop trust in these systems, their decisions must be explainable. In recent years, the globally-consistent rule-based summary-explanation and its max-support (MS) problem have been proposed; such explanations justify particular decisions while also providing useful statistics of the dataset. However, globally-consistent summary-explanations with limited complexity typically have small supports, if any exist at all. In this paper, we propose a relaxed version of summary-explanation, the $q$-consistent summary-explanation, which aims to achieve greater support at the cost of slightly lower consistency. The challenge is that the max-support problem of $q$-consistent summary-explanation (MSqC) is much more complex than the original MS problem, so standard branch-and-bound solvers take excessively long to solve it. To reduce solution time, this paper proposes the weighted column sampling (WCS) method, which solves smaller problems obtained by sampling variables according to their simplified increase support (SIS) values. Experiments verify that solving MSqC with the proposed SIS-based WCS method not only scales more efficiently, but also yields solutions with greater support and better global extrapolation effectiveness.
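The trade-off between support and consistency, and the idea of sampling columns by weight, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: a rule is modeled as a conjunction of feature-value conditions, support as the number of observations the rule covers, and consistency as the fraction of covered observations that share the explained label. The helper names `support_and_consistency` and `weighted_column_sample` are hypothetical, and the `scores` list is a placeholder for SIS values, whose definition the abstract does not give.

```python
import random


def support_and_consistency(rows, labels, rule, target):
    """Return (support, consistency) of a conjunctive rule.

    rule maps a feature index to the value it requires (assumed rule form).
    Support counts the covered rows; consistency is the share of covered
    rows whose label equals target.
    """
    covered = [y for x, y in zip(rows, labels)
               if all(x[j] == v for j, v in rule.items())]
    support = len(covered)
    consistency = sum(y == target for y in covered) / support if support else 0.0
    return support, consistency


def weighted_column_sample(weights, k, rng):
    """Draw k distinct column indices, each draw proportional to its weight.

    Generic weighted sampling without replacement; the weights stand in
    for SIS values, which are not defined here.
    """
    remaining = list(range(len(weights)))
    chosen = []
    for _ in range(min(k, len(remaining))):
        # A tiny floor keeps random.choices valid if every weight is zero.
        w = [max(weights[j], 1e-12) for j in remaining]
        pick = rng.choices(remaining, weights=w, k=1)[0]
        remaining.remove(pick)
        chosen.append(pick)
    return chosen


# Toy data: four observations with three binary features each.
rows = [(1, 0, 1), (1, 1, 0), (1, 0, 0), (0, 1, 1)]
labels = [1, 1, 0, 0]

# A fully consistent rule covers a single observation.
strict = support_and_consistency(rows, labels, {0: 1, 2: 1}, target=1)
# A shorter rule covers three observations at consistency 2/3.
relaxed = support_and_consistency(rows, labels, {0: 1}, target=1)

# Placeholder scores, one per column (not real SIS values).
scores = [5.0, 1.0, 0.5]
subset = weighted_column_sample(scores, k=2, rng=random.Random(0))
```

On this toy data the fully consistent rule has support 1, while the shorter rule reaches support 3 at consistency 2/3, so it would qualify under any threshold $q \le 2/3$. The sampled `subset` marks the columns a smaller subproblem would be restricted to; how such subproblems are solved and combined is not covered by this sketch.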