Federated learning (FL) has emerged as a prevalent distributed machine learning scheme that enables collaborative model training without aggregating raw data. Cloud service providers further embrace Federated Learning as a Service (FLaaS), allowing data analysts to execute their FL training pipelines over differentially-protected data. Due to the intrinsic properties of differential privacy, the enforced privacy level on data blocks can be viewed as a privacy budget that requires careful scheduling to cater to diverse training pipelines. Existing privacy budget scheduling studies prioritize either efficiency or fairness individually. In this paper, we propose DPBalance, a novel privacy budget scheduling mechanism that jointly optimizes both efficiency and fairness. We first develop a comprehensive utility function incorporating data analyst-level dominant shares and FL-specific performance metrics. A sequential allocation mechanism is then designed using the Lagrange multiplier method and effective greedy heuristics. We theoretically prove that DPBalance satisfies Pareto Efficiency, Sharing Incentive, Envy-Freeness, and Weak Strategy Proofness. We also theoretically prove the existence of a fairness-efficiency tradeoff in privacy budgeting. Extensive experiments demonstrate that DPBalance outperforms state-of-the-art solutions, achieving an average efficiency improvement of $1.44\times \sim 3.49 \times$, and an average fairness improvement of $1.37\times \sim 24.32 \times$.