Lower Bounds for Differential Privacy Under Continual Observation and Online Threshold Queries

One of the most basic problems for studying the "price of privacy over time" is the so-called private counter problem, introduced by Dwork et al. (2010) and Chan et al. (2010). In this problem, we aim to track the number of events that occur over time, while hiding the existence of any single event. More specifically, in every time step $t\in[T]$ we learn (in an online fashion) that $\Delta_t\geq 0$ new events have occurred, and must respond with an estimate $n_t\approx\sum_{j=1}^t \Delta_j$. The privacy requirement is that all of the outputs together, across all time steps, satisfy event-level differential privacy. The main question here is how the error must depend on the total number of time steps $T$ and the total number of events $n$. Dwork et al. (2015) showed an upper bound of $O\left(\log(T)+\log^2(n)\right)$, and Henzinger et al. (2023) showed a lower bound of $\Omega\left(\min\{\log n, \log T\}\right)$. We show a new lower bound of $\Omega\left(\min\{n,\log T\}\right)$, which is tight with respect to the dependence on $T$, and is tight in the sparse case where $\log^2 n=O(\log T)$. Our lower bound has the following implications:

$\bullet$ We show that our lower bound extends to the "online thresholds problem", where the goal is to privately answer many "quantile queries" when these queries are presented one by one. This resolves an open question of Bun et al. (2017).

$\bullet$ Our lower bound implies, for the first time, a separation between the number of mistakes obtainable by a private online learner and a non-private online learner. This partially resolves a COLT'22 open question posed by Sanyal and Ramponi.

$\bullet$ Our lower bound also yields the first separation between the standard model of private online learning and a recently proposed relaxed variant of it, called private online prediction.
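To make the private counter setting concrete, the following is a minimal sketch of one well-known baseline, the binary-tree ("binary") mechanism from the earlier literature on continual observation. It is not the construction behind the lower bound stated above, and it does not achieve the $O\left(\log(T)+\log^2(n)\right)$ upper bound; its error grows polylogarithmically in $T$. The names `private_counter`, `laplace`, `epsilon`, and `seed` are illustrative assumptions, and the sketch assumes $T$ is known in advance and that neighboring streams differ in a single event.

```python
import random


def laplace(scale, rng):
    """Sample Laplace(0, scale) as a scaled difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_counter(deltas, epsilon, seed=None):
    """Binary-tree mechanism sketch for the private counter problem.

    deltas[t-1] is the number of new events at time step t (Delta_t >= 0).
    Returns one estimate n_t of the running sum per time step.
    Assumes epsilon > 0 and that T = len(deltas) is known up front.
    """
    rng = random.Random(seed)
    T = len(deltas)
    levels = max(1, T.bit_length())  # number of dyadic levels
    # Each event is counted in at most one partial sum per level,
    # so noise of scale levels/epsilon per partial sum suffices.
    scale = levels / epsilon
    exact = [0.0] * levels  # exact dyadic partial sums
    noisy = [0.0] * levels  # their noisy, released counterparts
    estimates = []
    for t, delta in enumerate(deltas, start=1):
        i = (t & -t).bit_length() - 1  # index of the lowest set bit of t
        # Merge all lower levels plus the new arrivals into level i.
        exact[i] = delta + sum(exact[:i])
        for j in range(i):
            exact[j] = 0.0
            noisy[j] = 0.0
        noisy[i] = exact[i] + laplace(scale, rng)
        # The prefix [1, t] is covered by the levels set in t's binary form.
        estimates.append(sum(noisy[j] for j in range(levels) if (t >> j) & 1))
    return estimates
```

Each estimate sums at most `levels` noisy partial sums, which is where the polylogarithmic dependence on $T$ comes from in this baseline. Passing `epsilon=float("inf")` switches the noise off and recovers the exact running sums, which is convenient for checking the bookkeeping.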
