Conventional machine learning (ML) and deep learning approaches require sharing customers' sensitive information with an external credit bureau to build a prediction model, which opens the door to privacy leakage. This leakage risk poses an enormous challenge to cooperation among financial companies. Federated learning is a machine learning setting that can protect data privacy, but its high communication cost is often the bottleneck of federated systems, especially for large neural networks. Limiting the number and size of communications is therefore necessary for the practical training of large neural networks. Gradient sparsification, which sends only the significant gradients and accumulates the insignificant ones locally, has received increasing attention as a way to reduce communication cost. However, gradient sparsification cannot be used directly within the secure aggregation framework. This article proposes two sparsification methods to reduce communication cost in federated learning. The first is a time-varying hierarchical sparsification method for model parameter updates, which addresses the problem of maintaining model accuracy under a high sparsity ratio and significantly reduces the cost of a single communication. The second applies sparsification to the secure aggregation framework: we sparsify the encryption mask matrix to reduce communication cost while protecting privacy. Experiments show that, under different non-IID settings, our method reduces the upload communication cost to about 2.9% to 18.9% of that of the conventional federated learning algorithm when the sparse rate is 0.01.
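The idea of sending only significant gradients while accumulating the rest locally can be illustrated with a minimal sketch. It uses plain top-k selection by magnitude with a local residual, which is a common baseline and not the time-varying hierarchical scheme proposed in this article. The function name `sparsify_with_residual` and its parameters are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Tuple


def sparsify_with_residual(
    gradient: List[float],
    residual: List[float],
    sparse_rate: float,
) -> Tuple[Dict[int, float], List[float]]:
    """Return the top-k entries to upload and the updated local residual.

    Hypothetical helper: plain top-k sparsification, assumed for illustration.
    """
    if len(gradient) != len(residual):
        raise ValueError("gradient and residual must have the same length")
    if not 0.0 < sparse_rate <= 1.0:
        raise ValueError("sparse_rate must be in (0, 1]")

    # Add the new gradient to whatever was held back in earlier rounds.
    accumulated = [g + r for g, r in zip(gradient, residual)]

    # Keep at least one entry so that every round uploads something.
    k = max(1, int(len(accumulated) * sparse_rate))

    # Indices of the k entries with the largest magnitude.
    top_indices = sorted(
        range(len(accumulated)), key=lambda i: abs(accumulated[i]), reverse=True
    )[:k]

    # Sparse upload: index -> value for the selected entries only.
    upload = {i: accumulated[i] for i in top_indices}

    # Everything that was not uploaded stays in the local residual.
    new_residual = [0.0 if i in upload else v for i, v in enumerate(accumulated)]
    return upload, new_residual


# Example: a 10-element gradient with a sparse rate of 0.2 uploads 2 entries.
residual = [0.0] * 10
gradient = [0.5, -0.1, 0.05, 0.0, 0.2, -0.9, 0.01, 0.3, -0.02, 0.1]
upload, residual = sparsify_with_residual(gradient, residual, sparse_rate=0.2)
print(sorted(upload))  # indices selected for upload
```

Because small entries are carried over in the residual rather than discarded, a coordinate that is repeatedly small can still be uploaded once its accumulated value becomes large enough. Note that a sparse upload must carry indices as well as values, so the bytes sent are not simply the sparse rate times the dense size.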