Explaining why an aggregated measure changes is a central challenge in data analytics, and existing systems struggle to address it. Current attribution methods do not offer a unified solution that is simultaneously general across arbitrary measures, holistic across both data dimensions and measure composition, and rigorous in its interpretability. To bridge this gap, we introduce a principled framework that reframes attribution through the lens of cooperative game theory. Our key contribution is a classification of measures by their mathematical structure, which enables a spectrum of algorithms, from general approximations to exact closed-form solutions, that offers a principled trade-off between generality and performance. We evaluate the framework from several angles: simulations first confirm its numerical accuracy and then its generality for non-additive measures; a case study on Simpson's Paradox illustrates its interpretability; and a final experiment demonstrates its practical utility by significantly outperforming existing root cause analysis systems.
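To make the game-theoretic framing concrete, the following is a minimal sketch of how the change in a non-additive measure could be attributed to individual factors. The abstract does not name a solution concept, so the sketch assumes the Shapley value. The function `shapley_attribution`, the `conversion_rate` measure, and all numbers are hypothetical illustrations, not the framework's actual algorithms or data. Each factor is treated as a player, and a coalition is evaluated by switching its factors from their baseline values to their current values.

```python
from itertools import permutations
from math import factorial


def shapley_attribution(baseline, current, measure):
    """Exact Shapley attribution of measure(current) - measure(baseline).

    Each key of `baseline` is a player. A coalition is evaluated by setting
    its keys to their current values and leaving the rest at baseline.
    Enumerates all orderings, so it is only practical for a few players.
    """
    players = list(baseline)
    contrib = {p: 0.0 for p in players}
    for order in permutations(players):
        state = dict(baseline)
        prev = measure(state)
        for p in order:
            state[p] = current[p]      # player p joins the coalition
            new = measure(state)
            contrib[p] += new - prev   # marginal contribution of p
            prev = new
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in contrib.items()}


# Hypothetical ratio measure (non-additive) over two regions, A and B.
def conversion_rate(s):
    orders = s["orders_A"] + s["orders_B"]
    visits = s["visits_A"] + s["visits_B"]
    return orders / visits


baseline = {"orders_A": 50, "visits_A": 1000, "orders_B": 30, "visits_B": 500}
current = {"orders_A": 45, "visits_A": 1200, "orders_B": 40, "visits_B": 500}

attribution = shapley_attribution(baseline, current, conversion_rate)
total_change = conversion_rate(current) - conversion_rate(baseline)

for factor, value in attribution.items():
    print(f"{factor}: {value:+.5f}")
print(f"total change: {total_change:+.5f}")
```

Two properties of the Shapley value are visible in this sketch: the attributions sum to the total change in the measure, and a factor that did not change (`visits_B` here) receives zero. Exhaustive enumeration grows factorially with the number of factors, which is the kind of cost that sampling-based approximations or closed-form solutions for structured measures would avoid.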