Smooth Lower Bounds for Differentially Private Algorithms via Padding-and-Permuting Fingerprinting Codes

Fingerprinting arguments, first introduced by Bun, Ullman, and Vadhan (STOC 2014), are the most widely used method for establishing lower bounds on the sample complexity or error of approximately differentially private (DP) algorithms. Still, for many problems in differential privacy no suitable lower bounds are known, and even where they are known, the bounds are not smooth and usually become vacuous once the error exceeds some threshold. In this work, we present a new framework and tools for generating smooth lower bounds on the sample complexity of differentially private algorithms that satisfy only very weak accuracy guarantees. We illustrate the applicability of our method by providing new lower bounds in various settings: 1. A tight lower bound for DP averaging in the low-accuracy regime, which in particular implies a lower bound for the private 1-cluster problem introduced by Nissim, Stemmer, and Vadhan (PODS 2016). 2. A lower bound on the additive error of DP algorithms for approximate k-means clustering, as a function of the multiplicative error, which is tight for a constant multiplicative error. 3. A lower bound for estimating the top singular vector of a matrix under DP in low-accuracy regimes, which is a special case of the DP subspace estimation problem studied by Singhal and Steinke (NeurIPS 2021). Our main technique is to apply a padding-and-permuting transformation to a fingerprinting code. However, rather than proving our results using black-box access to an existing fingerprinting code (e.g., Tardos' code), we develop a new fingerprinting lemma that is stronger than those of Dwork et al. (FOCS 2015) and Bun et al. (SODA 2017), and prove our lower bounds directly from the lemma. In particular, our lemma gives a simpler fingerprinting code construction with optimal rate (up to polylogarithmic factors), which is of independent interest.
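To make the padding-and-permuting idea concrete, the following is a minimal sketch and not the construction from this work. It assumes, beyond what the abstract states, that "padding" appends constant all-ones and all-zeros columns to a binary codebook and that "permuting" applies one uniformly random column permutation shared by all codewords. The function name `pad_and_permute`, the parameter `pad`, and the toy codebook are hypothetical.

```python
import random


def pad_and_permute(codebook, pad, seed=None):
    """Hypothetical sketch: pad a binary codebook and shuffle its columns.

    Assumption (not taken from the abstract): every codeword receives `pad`
    all-ones columns and `pad` all-zeros columns, and a single random
    permutation is applied to the columns of all codewords.
    """
    rng = random.Random(seed)
    width = len(codebook[0]) + 2 * pad

    # One permutation shared by all rows, so columns stay aligned across users.
    perm = list(range(width))
    rng.shuffle(perm)

    # Append the constant columns to every codeword.
    padded = [row + [1] * pad + [0] * pad for row in codebook]

    # Output column k is padded column perm[k].
    transformed = [[row[j] for j in perm] for row in padded]
    return transformed, perm


# Toy codebook: 3 users, 3 columns, no column is constant.
codebook = [
    [1, 0, 1],
    [0, 0, 1],
    [1, 1, 0],
]
transformed, perm = pad_and_permute(codebook, pad=2, seed=0)

for row in transformed:
    print(row)
```

After the shuffle, the constant padding columns are hidden among the original code columns from the point of view of anyone who does not know `perm`. In this toy example the padding columns are still recognizable as the only constant ones, so the sketch illustrates the mechanics only, not any privacy or tracing property.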
