The Shapley value (SV) is a fair and principled metric for contribution evaluation in cross-silo federated learning (cross-silo FL), wherein organizations, i.e., clients, collaboratively train prediction models with the coordination of a parameter server. However, existing SV calculation methods for FL assume that the server can access the raw FL models and public test data. This may not be a valid assumption in practice, considering the emerging privacy attacks on FL models and the fact that test data might be clients' private assets. Hence, we investigate the problem of secure SV calculation for cross-silo FL. We first propose HESV, a one-server solution that relies solely on homomorphic encryption (HE) for privacy protection and is therefore limited in efficiency. To overcome these limitations, we propose SecSV, an efficient two-server protocol with the following novel features. First, SecSV utilizes a hybrid privacy protection scheme to avoid ciphertext-ciphertext multiplications between test data and models, which are extremely expensive under HE. Second, an efficient secure matrix multiplication method is proposed for SecSV. Third, SecSV strategically identifies and skips some test samples without significantly affecting the evaluation accuracy. Our experiments demonstrate that SecSV is 7.2 to 36.6 times as fast as HESV, with a limited loss in the accuracy of calculated SVs.
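To make the contribution metric concrete, the following is a minimal plaintext sketch of exact SV calculation for three clients. It is not the HESV or SecSV protocol: the client names, the `TOY_ACCURACY` table, and the helper functions are hypothetical, and the utility of a coalition is assumed here to be the test accuracy of a model aggregated from that coalition's clients. In the secure setting described above, each such utility evaluation would have to be carried out without exposing the models or the test data.

```python
from itertools import combinations
from math import factorial


def shapley_values(clients, utility):
    """Exact Shapley values for a small set of clients.

    For each client i, sum over every coalition S that excludes i:
        |S|! * (n - |S| - 1)! / n!  *  (utility(S + i) - utility(S))
    """
    n = len(clients)
    values = {}
    for i in clients:
        others = [c for c in clients if c != i]
        total = 0.0
        for k in range(n):  # k = size of the coalition S
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of client i to coalition S.
                total += weight * (utility(s | {i}) - utility(s))
        values[i] = total
    return values


# Hypothetical utilities (assumed for illustration only): test accuracy of
# the model aggregated from each coalition of clients A, B, and C.
TOY_ACCURACY = {
    frozenset(): 0.0,
    frozenset({"A"}): 0.5,
    frozenset({"B"}): 0.5,
    frozenset({"C"}): 0.2,
    frozenset({"A", "B"}): 0.8,
    frozenset({"A", "C"}): 0.6,
    frozenset({"B", "C"}): 0.6,
    frozenset({"A", "B", "C"}): 0.9,
}


def toy_utility(coalition):
    return TOY_ACCURACY[frozenset(coalition)]


sv = shapley_values(["A", "B", "C"], toy_utility)
for client, value in sv.items():
    print(f"{client}: {value:.4f}")
```

In this toy example the three values sum to the utility of the full coalition (0.9), and clients A and B, whose utilities are symmetric, receive equal values. Exact calculation evaluates the utility of all 2^n coalitions, so the cost of each model evaluation matters a great deal once those evaluations are performed under encryption.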