Differentially private weighted prefix sums under continual observation are a crucial component in the production deployment of private next-word prediction for Gboard, which, according to Google, has over a billion users. More specifically, Google uses a differentially private mechanism to sum weighted gradients in its \emph{private follow-the-regularized-leader} algorithm. Apart from efficiency, the additive error of the private mechanism is crucial: multiplied by the square root of the model dimension $d$ (with $d$ ranging up to $10$ trillion, for example, in Switch Transformers or M6-10T), it determines the accuracy of the learning system. Hence, any improvement in the leading constant matters significantly in practice. In this paper, we show a novel connection between mechanisms for continual weighted prefix sums and a concept from representation theory known as the group matrix, introduced in the correspondence between Dedekind and Frobenius (1897) and generalized by Schur (1904). To the best of our knowledge, this is the first application of group algebra to the analysis of differentially private algorithms. Using this connection, we analyze a class of matrix norms known as {\em factorization norms}, which give upper and lower bounds on the additive error of the matrix mechanism under general $\ell_p$-norms. This yields the first efficient factorization that matches the best-known non-constructive upper bound on the factorization norm by Mathias (1993) for the matrix used in Google's deployment, improving on the previous best constructive bounds of Fichtenberger et al. (ICML 2023) and Henzinger et al. (SODA 2023), as well as the first upper bounds on the additive error for a large class of weight functions in weighted prefix-sum problems, including the sliding-window matrix (Bolot et al., ICDT 2013).
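To make the matrix-mechanism setting concrete, here is a minimal sketch for the simplest (unweighted) case: the continual prefix-sum matrix $A$ is the lower-triangular all-ones matrix, and one explicit factorization $A = LR$ with $L = R = A^{1/2}$ has Toeplitz entries $\binom{2k}{k}/4^k$ (the Taylor coefficients of $(1-x)^{-1/2}$), as used in the constructive bounds discussed above. The noise scale below is left symbolic; calibrating it to a concrete privacy budget is omitted.

```python
import numpy as np
from math import comb

def sqrt_prefix_factor(n: int) -> np.ndarray:
    """Lower-triangular Toeplitz square root of the prefix-sum matrix.

    Entry (i, j) for j <= i is f(i - j) with f(k) = C(2k, k) / 4^k,
    the k-th Taylor coefficient of (1 - x)^(-1/2).
    """
    f = [comb(2 * k, k) / 4**k for k in range(n)]
    L = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1):
            L[i, j] = f[i - j]
    return L

n = 8
A = np.tril(np.ones((n, n)))          # continual prefix-sum matrix
L = sqrt_prefix_factor(n)
assert np.allclose(L @ L, A)          # L is indeed a square root of A

# Matrix mechanism: privately release A x as L (R x + z) with R = L.
rng = np.random.default_rng(0)
x = rng.standard_normal(n)            # stream of (clipped) gradients, illustrative
z = rng.standard_normal(n)            # Gaussian noise; privacy calibration omitted
noisy_prefix_sums = L @ (L @ x + z)
```

The additive error of this release is governed by the factorization norms of $L$ and $R$, which is exactly the quantity the bounds above control; the weighted case replaces $A$ with a general lower-triangular weight matrix.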