Smooth Lower Bounds for Differentially Private Algorithms via Padding-and-Permuting Fingerprinting Codes

Fingerprinting arguments, first introduced by Bun, Ullman, and Vadhan (STOC 2014), are the most widely used method for establishing lower bounds on the sample complexity or error of approximately differentially private (DP) algorithms. Still, for many problems in differential privacy no suitable lower bounds are known, and even where they are known, the lower bounds are not smooth: they usually become vacuous once the error exceeds some threshold. We present a new framework and tools for proving smooth lower bounds on the sample complexity of differentially private algorithms that satisfy only very weak accuracy guarantees. We illustrate the applicability of our method by providing new lower bounds in various settings: 1. A tight lower bound for DP averaging in the low-accuracy regime, which in particular implies a lower bound for the private 1-cluster problem introduced by Nissim, Stemmer, and Vadhan (PODS 2016). 2. A lower bound on the additive error of DP algorithms for approximate k-means clustering and general (k,z)-clustering, as a function of the multiplicative error, which is tight for a constant multiplicative error. 3. A lower bound for estimating the top singular vector of a matrix under DP in low-accuracy regimes, which is a special case of DP subspace estimation studied by Singhal and Steinke (NeurIPS 2021). Our main technique is to apply a padding-and-permuting transformation to a fingerprinting code. However, rather than proving our results using black-box access to an existing fingerprinting code (e.g., Tardos' code), we develop a new fingerprinting lemma that is stronger than those of Dwork et al. (FOCS 2015) and Bun et al. (SODA 2017), and prove our lower bounds directly from the lemma. In particular, our lemma yields a simpler fingerprinting code construction with optimal rate (up to polylogarithmic factors), which is of independent interest.
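
To make the padding-and-permuting idea concrete, here is a minimal sketch of one plausible form of the transformation. The abstract does not spell out the details, so the specifics are assumptions made for illustration: the codebook is taken to be a matrix with entries in {-1, +1} (one row per user), padding appends `pad` all-(+1) columns and `pad` all-(-1) columns, and a single uniformly random column permutation is then applied to every row. The function name `pad_and_permute` and its parameters are hypothetical.

```python
import random


def pad_and_permute(codebook, pad, seed=None):
    """Pad a {-1,+1} codebook with constant columns, then shuffle all columns.

    Assumed (illustrative) form of the transformation:
      1. append `pad` columns of +1 and `pad` columns of -1 to every row;
      2. apply the same random column permutation to every row.
    Returns the transformed codebook and the permutation that was used.
    """
    if not codebook:
        raise ValueError("codebook must contain at least one row")
    d = len(codebook[0])
    if any(len(row) != d for row in codebook):
        raise ValueError("all codewords must have the same length")

    rng = random.Random(seed)
    width = d + 2 * pad
    perm = list(range(width))
    rng.shuffle(perm)  # one shared permutation, hidden from the adversary

    transformed = []
    for row in codebook:
        padded = list(row) + [1] * pad + [-1] * pad
        # Column j of the output is column perm[j] of the padded row.
        transformed.append([padded[j] for j in perm])
    return transformed, perm


# Toy codebook: 3 users, codewords of length 3, no constant columns.
codebook = [[1, -1, 1], [-1, -1, 1], [1, 1, -1]]
transformed, perm = pad_and_permute(codebook, pad=2, seed=0)
for row in transformed:
    print(row)
```

In this sketch the padded columns are constant across all rows, so an algorithm with even weak accuracy can be expected to agree with them; because the permutation hides which columns are padding, that agreement has to carry over to the original code columns as well. This is only an intuition for why such a transformation is useful, not a statement of the paper's proof.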
