We consider the privacy guarantees of an algorithm in which a user's data is used in $k$ steps chosen uniformly at random from a sequence (or set) of $t$ differentially private steps. We demonstrate that the privacy guarantees of this sampling scheme are upper bounded by the privacy guarantees of the well-studied independent (or Poisson) subsampling, in which each step uses the user's data independently with probability $(1+o(1))k/t$. Further, we provide two additional analysis techniques that lead to numerical improvements in some parameter regimes. The case $k=1$ has previously been studied in the context of DP-SGD by Balle et al. (2020) and, very recently, by Chua et al. (2024) and Choquette-Choo et al. (2024). The privacy analysis of Balle et al. (2020) relies on privacy amplification by shuffling, which leads to overly conservative bounds. The privacy analyses of Chua et al. (2024) and Choquette-Choo et al. (2024) rely on Monte Carlo simulations that are computationally prohibitive in many practical scenarios and have additional inherent limitations.
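To make the two sampling schemes concrete, the following is a minimal Python sketch (ours, not from the paper): `random_allocation` places the user's data in exactly $k$ of the $t$ steps chosen uniformly without replacement, while `poisson_subsampling` includes it in each step independently. The function names are illustrative, and for simplicity the sampling rate is taken to be exactly $k/t$, ignoring the $(1+o(1))$ factor.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_allocation(t: int, k: int) -> np.ndarray:
    """Choose exactly k of the t steps uniformly at random (without
    replacement); the user's data participates in precisely those steps."""
    mask = np.zeros(t, dtype=bool)
    mask[rng.choice(t, size=k, replace=False)] = True
    return mask

def poisson_subsampling(t: int, q: float) -> np.ndarray:
    """Each of the t steps independently includes the user's data with
    probability q; here q is set to k/t, so k steps participate only
    in expectation."""
    return rng.random(t) < q

# Example: t = 1000 steps, user's data used in k = 10 of them.
t, k = 1000, 10
alloc_mask = random_allocation(t, k)          # exactly k participating steps
poisson_mask = poisson_subsampling(t, k / t)  # ~k participating steps on average
print(alloc_mask.sum(), poisson_mask.sum())
```

The result stated above says that, under this correspondence, the privacy guarantees of the first scheme are no worse than those of the second (with the sampling rate inflated by a $(1+o(1))$ factor).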