The Shapley value (SV) has emerged as a promising method for data valuation. However, computing the SV exactly, or even estimating it, is often expensive. To address this challenge, Jia et al. (2019) propose an SV estimation algorithm called the ``Group Testing-based SV estimator'', which achieves favorable asymptotic sample complexity. In this technical note, we present several improvements to the analysis and design choices of this SV estimator. Moreover, we point out that the Group Testing-based SV estimator does not fully reuse the collected samples. Our analysis and insights contribute to a better understanding of the challenges in developing efficient SV estimation algorithms for data valuation.