Federated learning (FL), a recent branch of distributed machine learning (ML), trains global models without direct access to clients' local datasets. However, the model updates exchanged between clients and the server (e.g., gradient updates in deep neural networks) can still reveal sensitive information to adversaries. Differential privacy (DP) offers a framework that provides a privacy guarantee by adding a controlled amount of noise to the parameters. Although effective for privacy, this approach degrades model performance because of the injected noise. A balance must therefore be found between the amount of noise injected and the accuracy sacrificed. To address this challenge, we propose adaptive noise addition in FL, which sets the amount of injected noise according to features' relative importance. We first propose two effective methods for prioritizing features in deep neural network models and then perturb the models' weights based on this information. Specifically, we investigate whether adding more noise to less important parameters and less noise to more important ones can preserve model accuracy while maintaining privacy. Our experiments confirm this hypothesis under some conditions. The amount of noise injected, the proportion of parameters involved, and the number of global iterations can each significantly change the outcome. A careful choice of these parameters that takes the properties of the dataset into account can improve privacy without a severe loss of accuracy, whereas a poor choice can worsen model performance.
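The core idea of importance-aware perturbation can be illustrated with a minimal sketch. It is not the method evaluated in this work: the abstract does not specify the two importance measures, so the example uses absolute weight magnitude as a stand-in, and the function name `adaptive_gaussian_noise` and the parameters `low_sigma`, `high_sigma`, and `fraction` are hypothetical. The sketch also does not by itself establish a formal DP guarantee, which would additionally require bounding sensitivity (e.g., by clipping) and calibrating the noise to a privacy budget.

```python
import numpy as np


def adaptive_gaussian_noise(weights, importance, low_sigma=0.01,
                            high_sigma=0.1, fraction=0.5, rng=None):
    """Add Gaussian noise whose scale depends on parameter importance.

    The `fraction` least important parameters receive noise with standard
    deviation `high_sigma`; all others receive `low_sigma`.
    All names and default values here are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    weights = np.asarray(weights, dtype=float)
    importance = np.asarray(importance, dtype=float)
    if weights.shape != importance.shape:
        raise ValueError("weights and importance must have the same shape")

    flat_importance = importance.ravel()
    n_noisy = int(round(fraction * flat_importance.size))

    # Ascending sort: the least important parameters come first.
    order = np.argsort(flat_importance, kind="stable")

    # Per-parameter noise scale: low by default, high for the least important.
    sigma = np.full(flat_importance.size, low_sigma)
    sigma[order[:n_noisy]] = high_sigma
    sigma = sigma.reshape(weights.shape)

    # Zero-mean Gaussian noise scaled elementwise by sigma.
    noise = rng.standard_normal(weights.shape) * sigma
    return weights + noise, sigma


# Usage: weight magnitude as a placeholder importance score (an assumption).
weights = np.array([0.5, -0.01, 2.0, 0.02])
noisy_weights, sigma = adaptive_gaussian_noise(
    weights, np.abs(weights), rng=np.random.default_rng(0)
)
```

In this sketch, `fraction` plays the role of the proportion of parameters involved and the `high_sigma`/`low_sigma` pair the amount of noise injected, two of the factors the abstract identifies as strongly affecting the outcome.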