The Shapley value (SV) is a fair and principled metric for contribution evaluation in cross-silo federated learning (cross-silo FL), in which organizations, i.e., clients, collaboratively train prediction models under the coordination of a parameter server. However, existing SV calculation methods for FL assume that the server can access the raw FL models and public test data. This assumption may not hold in practice, given the emerging privacy attacks on FL models and the fact that test data may be clients' private assets. Hence, we investigate the problem of secure SV calculation for cross-silo FL. We first propose HESV, a one-server solution that relies solely on homomorphic encryption (HE) for privacy protection and is therefore limited in efficiency. To overcome these limitations, we propose SecSV, an efficient two-server protocol with the following novel features. First, SecSV uses a hybrid privacy protection scheme to avoid ciphertext-ciphertext multiplications between test data and models, which are extremely expensive under HE. Second, SecSV employs an efficient secure matrix multiplication method. Third, SecSV strategically identifies and skips some test samples without significantly affecting the evaluation accuracy. Our experiments demonstrate that SecSV is 7.2 to 36.6 times as fast as HESV, with a limited loss in the accuracy of the calculated SVs.
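For readers unfamiliar with the metric, the following is a minimal sketch of how a Shapley value is computed in the clear, i.e., without any of the privacy protection that HESV or SecSV provide. It uses the standard definition (a weighted average of a client's marginal contribution over all coalitions of the other clients). The client names, the `TOY_ACCURACY` table, and the function names are hypothetical and chosen only for illustration; in the setting of the abstract, the utility would be the accuracy of a model aggregated from a coalition's local models and evaluated on test data.

```python
from itertools import combinations
from math import factorial


def shapley_values(clients, utility):
    """Exact Shapley values by enumerating all coalitions (exponential cost).

    `utility` maps a frozenset of clients to a number, e.g. test accuracy.
    """
    n = len(clients)
    values = {c: 0.0 for c in clients}
    for c in clients:
        others = [x for x in clients if x != c]
        for k in range(n):
            # Weight of a coalition of size k that does not contain c.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of c when joining coalition s.
                values[c] += weight * (utility(s | {c}) - utility(s))
    return values


# Hypothetical utilities (assumed numbers, not results from the paper):
# accuracy of the model obtained from each coalition of clients A, B, C.
TOY_ACCURACY = {
    frozenset(): 0.0,
    frozenset("A"): 0.6,
    frozenset("B"): 0.6,
    frozenset("C"): 0.2,
    frozenset("AB"): 0.8,
    frozenset("AC"): 0.7,
    frozenset("BC"): 0.7,
    frozenset("ABC"): 0.9,
}


def toy_utility(coalition):
    return TOY_ACCURACY[frozenset(coalition)]


sv = shapley_values(["A", "B", "C"], toy_utility)
```

In this toy example the values sum to the utility of the full coalition, and the two clients with identical utilities receive identical values. Each utility evaluation corresponds to testing one aggregated model, which is the step that becomes expensive once models and test data must remain encrypted.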