Streaming Private Continual Counting via Binning

In differential privacy, $\textit{continual observation}$ refers to problems in which we wish to continuously release a function of a dataset that is revealed one element at a time. The challenge is to maintain a good approximation while keeping the combined output over all time steps differentially private. In the special case of $\textit{continual counting}$ we seek to approximate a sum of binary input elements. This problem has received considerable attention lately, in part due to its relevance in implementations of differentially private stochastic gradient descent. $\textit{Factorization mechanisms}$ are the leading approach to continual counting, but the best such mechanisms do not work well in $\textit{streaming}$ settings since they require space proportional to the size of the input. In this paper, we present a simple approach to approximating factorization mechanisms in low space via $\textit{binning}$, where adjacent matrix entries with similar values are changed to be identical in such a way that a matrix-vector product can be maintained in sublinear space. Our approach has provable sublinear space guarantees for a class of lower triangular matrices whose entries are monotonically decreasing away from the diagonal. We show empirically that even with very low space usage we are able to closely match, and sometimes surpass, the performance of asymptotically optimal factorization mechanisms. Recently, and independently of our work, Dvijotham et al. have also suggested an approach to implementing factorization mechanisms in a streaming setting. Their work differs from ours in several respects: It only addresses factorization into $\textit{Toeplitz}$ matrices, only considers $\textit{maximum}$ error, and uses a different technique based on rational function approximation that seems less versatile than our binning approach.
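To make the binning idea concrete, the following is a minimal sketch, not the construction from the paper. It assumes a lower-triangular Toeplitz matrix whose coefficients are the Taylor coefficients of $1/\sqrt{1-x}$ (one commonly used square-root factorization of the all-ones lower-triangular counting matrix), and it groups past inputs into bins whose sizes double with age, in the style of an exponential histogram. Every input in a bin is multiplied by the same coefficient, so only one running sum per bin is stored. The names `sqrt_coeffs`, `exact_products`, `BinnedToeplitzProduct`, and `per_size` are hypothetical, and the sketch says nothing about the privacy or error analysis of a full mechanism.

```python
def sqrt_coeffs(n):
    """First n Taylor coefficients of 1/sqrt(1 - x): 1, 1/2, 3/8, ..."""
    c = [1.0]
    for k in range(1, n):
        c.append(c[-1] * (2 * k - 1) / (2 * k))
    return c


def exact_products(coeffs, zs):
    """Reference (linear space): entry t is sum_j coeffs[t - j] * zs[j]."""
    return [sum(coeffs[t - j] * zs[j] for j in range(t + 1))
            for t in range(len(zs))]


class BinnedToeplitzProduct:
    """Streaming approximation of a lower-triangular Toeplitz product.

    Assumption for illustration: bins are merged like an exponential
    histogram, keeping at most `per_size` bins of each power-of-two size.
    """

    def __init__(self, coeffs, per_size=4):
        self.c = coeffs          # decreasing coefficients c[0], c[1], ...
        self.per_size = per_size
        self.bins = []           # [start, end, sum of inputs], oldest first
        self.t = 0               # index of the next input

    def update(self, z):
        """Consume one input and return the approximate current output."""
        self.bins.append([self.t, self.t, z])
        self._merge()
        out = 0.0
        for start, end, total in self.bins:
            # One shared coefficient per bin: the one at the bin's midpoint.
            offset = self.t - (start + end) // 2
            out += self.c[offset] * total
        self.t += 1
        return out

    def _merge(self):
        size = 1
        while True:
            idx = [i for i, b in enumerate(self.bins)
                   if b[1] - b[0] + 1 == size]
            if len(idx) <= self.per_size:
                break
            i = idx[0]           # the two oldest bins of this size are adjacent
            a, b = self.bins[i], self.bins[i + 1]
            self.bins[i:i + 2] = [[a[0], b[1], a[2] + b[2]]]
            size *= 2


n = 512
coeffs = sqrt_coeffs(n)
stream = [1 if (i * 37) % 5 < 2 else 0 for i in range(n)]  # toy binary input
binned = BinnedToeplitzProduct(coeffs, per_size=4)
approx = [binned.update(z) for z in stream]
exact = exact_products(coeffs, stream)
print("bins kept:", len(binned.bins), "of", n, "inputs")
print("last exact vs. binned:", exact[-1], approx[-1])
```

With `per_size` bins per size class, the number of stored sums grows logarithmically in the stream length. Raising `per_size` keeps more of the matrix exact at the cost of more space.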

