Federated learning enables collaborative model training across distributed institutions without centralizing sensitive data; however, ensuring algorithmic fairness across heterogeneous data distributions while preserving privacy remains fundamentally unresolved. This paper introduces CryptoFair-FL, a novel cryptographic framework providing the first verifiable fairness guarantees for federated learning systems under formal security definitions. The proposed approach combines additively homomorphic encryption with secure multi-party computation to enable privacy-preserving verification of demographic parity and equalized odds metrics without revealing protected attribute distributions or individual predictions. A novel batched verification protocol reduces computational complexity from $O(n^2)$ to $O(n \log n)$ while maintaining $(\epsilon, \delta)$-differential privacy with $\epsilon = 0.5$ and $\delta = 10^{-6}$. Theoretical analysis establishes information-theoretic lower bounds on the privacy cost of fairness verification, demonstrating that the proposed protocol achieves near-optimal privacy-fairness tradeoffs. Comprehensive experiments across four benchmark datasets (MIMIC-IV healthcare records, Adult Income, CelebA, and a novel FedFair-100 benchmark) demonstrate that CryptoFair-FL reduces the demographic parity difference from 0.231 to 0.031 while incurring only a 2.3× computational overhead compared to standard federated averaging. The framework successfully defends against attribute inference attacks, keeping the adversarial success probability below 0.05 across all tested configurations. These results establish a practical pathway for deploying fairness-aware federated learning in regulated industries that require both privacy protection and algorithmic accountability.
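To make the central idea concrete, the following is a minimal illustrative sketch, not the CryptoFair-FL protocol itself. It assumes a toy Paillier-style additively homomorphic scheme with deliberately tiny parameters, and it assumes each client holds four local counts (positive predictions and totals for two protected groups). An aggregator sums the ciphertexts without seeing any per-client count, and only the aggregate is decrypted to compute the demographic parity difference. The function names, the count layout, and the example numbers are hypothetical; the sketch omits the secure multi-party computation, the batched verification, and the differential privacy noise described above.

```python
import math
import secrets

# Toy Paillier parameters built from two known Mersenne primes.
# Assumption for illustration only: far too small for real security.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)  # modular inverse; valid for generator g = N + 1


def encrypt(m: int) -> int:
    """Encrypt a non-negative integer m < N under the toy public key."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) % N2 * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """Decrypt with the private values LAM and MU."""
    return (pow(c, LAM, N2) - 1) // N * MU % N


def add(c1: int, c2: int) -> int:
    """Homomorphic addition: multiplying ciphertexts adds plaintexts."""
    return c1 * c2 % N2


def demographic_parity_difference(client_counts):
    """client_counts: one (pos_a, total_a, pos_b, total_b) tuple per client.

    Hypothetical layout: pos_x is the number of positive predictions for
    protected group x, total_x is the number of samples in that group.
    """
    # Each client encrypts its local counts before sharing them.
    encrypted = [[encrypt(v) for v in counts] for counts in client_counts]
    # The aggregator combines ciphertexts component-wise, never decrypting
    # an individual client's contribution.
    totals = encrypted[0]
    for row in encrypted[1:]:
        totals = [add(t, c) for t, c in zip(totals, row)]
    # Only the aggregate counts are decrypted.
    pos_a, total_a, pos_b, total_b = (decrypt(c) for c in totals)
    return abs(pos_a / total_a - pos_b / total_b)


if __name__ == "__main__":
    # Made-up counts for two clients, purely for demonstration.
    clients = [(30, 100, 20, 100), (45, 150, 15, 50)]
    print(demographic_parity_difference(clients))
```

In this sketch a single key holder decrypts the aggregate, which still reveals the global group counts; hiding those as well is the role the abstract assigns to secure multi-party computation and differential privacy. Equalized odds could be handled the same way by additionally splitting the counts by true label.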