Attribution scores can be applied in data management to quantify the contribution of individual items to conclusions drawn from the data, as part of explaining what led to these conclusions. In Artificial Intelligence, Machine Learning, and Data Management, several of the common scores are instances of the Shapley value, a formula for profit sharing in cooperative game theory. Since its invention in the 1950s, the Shapley value has been used for contribution measurement in many fields, from economics to law, and most recently in modern machine learning. Recent studies have investigated the application of the Shapley value to database management. This article gives an overview of recent results on the computational complexity of the Shapley value for measuring the contribution of tuples to query answers and to the extent of inconsistency with respect to integrity constraints. More specifically, the article highlights lower and upper bounds on the complexity of calculating the Shapley value, either exactly or approximately, as well as solutions for realizing the calculation in practice.
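To make the notion of a tuple's contribution to a query answer concrete, the following is a minimal sketch of the textbook Shapley value computed by brute force over all coalitions. It is not an algorithm from the surveyed results: the toy database, the Boolean query, and the names `shapley` and `query` are hypothetical, every fact is treated as a player, and the enumeration takes exponential time, which is why the complexity bounds discussed in the article matter.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def shapley(players, value):
    """Exact Shapley value of each player by enumerating all coalitions.

    `value` maps a frozenset of players to a number. Runs in time
    exponential in the number of players; for illustration only.
    """
    players = list(players)
    n = len(players)
    scores = {}
    for p in players:
        others = [q for q in players if q != p]
        total = Fraction(0)
        for k in range(n):
            # Probability weight of a coalition of size k preceding p.
            weight = Fraction(factorial(k) * factorial(n - k - 1), factorial(n))
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Marginal contribution of p when added to coalition s.
                total += weight * (value(s | {p}) - value(s))
        scores[p] = total
    return scores


def query(facts):
    """Hypothetical Boolean query q() :- R(x), S(x, y); returns 1 or 0."""
    r_values = {f[1] for f in facts if f[0] == "R"}
    return int(any(f[0] == "S" and f[1] in r_values for f in facts))


# Hypothetical toy database; every fact is a player in the game.
facts = [("R", "a"), ("S", "a", "b"), ("S", "a", "c")]
scores = shapley(facts, query)
for fact, score in scores.items():
    print(fact, score)
```

In this toy instance the fact `R(a)` receives 2/3 and each `S` fact receives 1/6. The scores sum to 1, the value of the query on the full database, which reflects the efficiency property of the Shapley value.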