We train language models (LMs) with federated learning (FL) and differential privacy (DP) in the Google Keyboard (Gboard). We apply the DP-Follow-the-Regularized-Leader (DP-FTRL)~\citep{kairouz21b} algorithm to achieve meaningful formal DP guarantees without requiring uniform sampling of client devices. To provide favorable privacy-utility trade-offs, we introduce a new client participation criterion and discuss the implications of its configuration in large-scale systems. We show how quantile-based clip estimation~\citep{andrew2019differentially} can be combined with DP-FTRL, either to choose the clip norm adaptively during training or to reduce hyperparameter tuning before training. With the help of pretraining on public data, we train and deploy more than twenty Gboard LMs that achieve high utility and $\rho$-zCDP privacy guarantees with $\rho \in (0.2, 2)$; two of these models are additionally trained with secure aggregation~\citep{bonawitz2017practical}. We are happy to announce that all next-word-prediction neural network LMs in Gboard now have DP guarantees, and that all future launches of Gboard neural network LMs will require DP guarantees. We summarize our experience and provide concrete suggestions on DP training for practitioners.
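
For readers more familiar with $(\varepsilon, \delta)$-DP, the range of $\rho$ above can be read through the standard conversion of Bun and Steinke (2016). This is a sketch for orientation only: the conversion is a general property of zCDP, and the value of $\delta$ used below is an illustrative assumption, not a figure taken from this work. A mechanism satisfying $\rho$-zCDP also satisfies $(\varepsilon, \delta)$-DP for every $\delta \in (0, 1)$ with
\[
  \varepsilon = \rho + 2\sqrt{\rho \ln(1/\delta)} .
\]
Taking $\delta = 10^{-10}$ as an assumed example, $\rho = 0.2$ gives $\varepsilon \approx 0.2 + 2\sqrt{0.2 \times 23.03} \approx 4.49$, and $\rho = 2$ gives $\varepsilon \approx 2 + 2\sqrt{2 \times 23.03} \approx 15.57$. This bound is not tight, so a more careful conversion may yield a smaller $\varepsilon$ for the same $\rho$ and $\delta$.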