Differentially private federated learning (DP-FL) often suffers from slow convergence under tight privacy budgets because the noise required for privacy preservation degrades gradient quality. Although second-order optimization can accelerate training, existing approaches for DP-FL face significant scalability limitations: Newton-type methods require clients to compute Hessians, while feature covariance methods scale poorly with model dimension. We propose DP-FedSOFIM, a simple and scalable second-order optimization method for DP-FL. The method constructs an online regularized proxy for the Fisher information matrix at the server using only privatized aggregated gradients, capturing useful curvature information without requiring Hessian computations or feature covariance estimation. Efficient rank-one updates based on the Sherman-Morrison formula enable communication costs proportional to the model size and require only O(d) client-side memory. Because all curvature and preconditioning operations are performed at the server on already privatized gradients, DP-FedSOFIM introduces no additional privacy cost beyond the underlying privatized gradient release mechanism. Experiments on CIFAR-10 and PathMNIST show that DP-FedSOFIM converges faster and consistently achieves higher accuracy than DP-FedGD, DP-SCAFFOLD, and DP-FedFC across a range of privacy budgets, with particularly pronounced gains under stringent privacy constraints.
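The server-side rank-one update mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the Fisher proxy is an exponential moving average of outer products of the privatized aggregated gradient, F_t = beta * F_{t-1} + (1 - beta) * g g^T, initialized as rho * I. The names `sherman_morrison_update` and `server_step`, and the parameters `beta`, `rho`, and `lr`, are hypothetical. For readability the sketch stores a dense d x d inverse on the server, which a scalable implementation would presumably avoid.

```python
# Illustrative sketch only: the EMA form of the Fisher proxy and every name
# and parameter below (beta, rho, lr) are assumptions, not taken from the paper.

def matvec(M, v):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sherman_morrison_update(F_inv, g, beta):
    """Given F_inv = F^{-1} (symmetric), return the inverse of
    beta * F + (1 - beta) * g g^T without re-inverting the matrix."""
    Fg = matvec(F_inv, g)                      # F^{-1} g
    denom = beta + (1.0 - beta) * dot(g, Fg)   # positive when F is positive definite
    scale = (1.0 - beta) / (beta * denom)
    d = len(g)
    return [[F_inv[i][j] / beta - scale * Fg[i] * Fg[j] for j in range(d)]
            for i in range(d)]

def server_step(theta, F_inv, noisy_grad, beta=0.9, lr=0.1):
    """One server update. noisy_grad is assumed to be already privatized
    (clipped, aggregated, noised), so this step is pure post-processing."""
    F_inv = sherman_morrison_update(F_inv, noisy_grad, beta)
    direction = matvec(F_inv, noisy_grad)      # preconditioned gradient
    theta = [t - lr * s for t, s in zip(theta, direction)]
    return theta, F_inv

if __name__ == "__main__":
    d, rho = 3, 1.0
    theta = [0.0] * d
    # Inverse of the regularized initial proxy rho * I.
    F_inv = [[(1.0 / rho) if i == j else 0.0 for j in range(d)] for i in range(d)]
    for g in ([1.0, -2.0, 0.5], [0.3, 0.1, -0.7]):  # stand-ins for privatized gradients
        theta, F_inv = server_step(theta, F_inv, g)
    print(theta)
```

Because `server_step` touches only the already privatized gradient, it is post-processing in the differential privacy sense, which matches the abstract's claim that preconditioning adds no privacy cost. Clients in this sketch send and receive only d-dimensional vectors.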