One of the main challenges in interpreting black-box models is to uniquely decompose a square-integrable function of random inputs that are not mutually independent into a sum of functions of every possible subset of variables. Dependencies among inputs make this decomposition difficult. We propose a novel framework to study this problem, linking three domains of mathematics: probability theory, functional analysis, and combinatorics. We show that, under two reasonable assumptions on the inputs (non-perfect functional dependence and non-degenerate stochastic dependence), such a function can always be decomposed uniquely. This ``canonical decomposition'' is relatively intuitive and unveils the linear nature of non-linear functions of non-linearly dependent inputs. In this framework, we effectively generalize the well-known Hoeffding decomposition, which can be seen as a particular case. Oblique projections of the black-box model allow for novel interpretability indices for evaluation and variance decomposition. These indices are intuitive, and their properties are studied and discussed. This result offers a path towards more precise uncertainty quantification, which can benefit sensitivity analyses and interpretability studies whenever the inputs are dependent. The decomposition is illustrated analytically, and the challenges to adopting these results in practice are discussed.
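For orientation, the decomposition referred to above can be sketched as follows. The notation ($G$ for the black-box model, $X = (X_1, \dots, X_d)$ for the inputs, $D$ for the index set, $G_A$ for the summands) is assumed here for illustration and is not fixed by the abstract.

\[
  G(X) \;=\; \sum_{A \subseteq D} G_A(X_A),
  \qquad D = \{1, \dots, d\},
  \qquad X_A = (X_i)_{i \in A}.
\]

In the classical Hoeffding setting, where the inputs are mutually independent, the summands are pairwise orthogonal in $L^2$, so the variance splits term by term:

\[
  \operatorname{Var}\bigl(G(X)\bigr) \;=\; \sum_{\emptyset \neq A \subseteq D} \operatorname{Var}\bigl(G_A(X_A)\bigr).
\]

With dependent inputs this orthogonality generally fails, which is presumably why the framework relies on oblique rather than orthogonal projections.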