We train language models (LMs) with federated learning (FL) and differential privacy (DP) in the Google Keyboard (Gboard). We apply the DP-Follow-the-Regularized-Leader (DP-FTRL)~\citep{kairouz21b} algorithm to achieve meaningful formal DP guarantees without requiring uniform sampling of client devices. To provide favorable privacy-utility trade-offs, we introduce a new client participation criterion and discuss the implications of its configuration in large-scale systems. We show how quantile-based clip estimation~\citep{andrew2019differentially} can be combined with DP-FTRL to adaptively choose the clip norm during training or to reduce hyperparameter tuning in preparation for training. With the help of pretraining on public data, we train and deploy more than twenty Gboard LMs that achieve high utility and $\rho$-zCDP privacy guarantees with $\rho \in (0.2, 2)$, with two models additionally trained with secure aggregation~\citep{bonawitz2017practical}. We are happy to announce that all next-word-prediction neural network LMs in Gboard now have DP guarantees, and all future launches of Gboard neural network LMs will require DP guarantees. We summarize our experience and provide concrete suggestions on DP training for practitioners.