Analysis of Shuffling Beyond Pure Local Differential Privacy

Shuffling is a powerful way to amplify the privacy of a local randomizer in private distributed data analysis. Most existing analyses of how shuffling amplifies privacy are based on the pure local differential privacy (DP) parameter $\varepsilon_0$. This paper asks whether $\varepsilon_0$ adequately captures the privacy amplification. For example, since the Gaussian mechanism does not satisfy pure local DP for any finite $\varepsilon_0$, does it follow that shuffling yields only weak amplification? To address this question, we revisit the privacy blanket bound of Balle et al. (the blanket divergence) and develop a direct asymptotic analysis that bypasses $\varepsilon_0$. Our key finding is that, asymptotically, the blanket divergence depends on the local mechanism only through a single scalar parameter $\chi$, and that this dependence is monotonic. This parameter therefore serves as a proxy for shuffling efficiency, and we call it the shuffle index. By applying this analysis to both upper and lower bounds on the shuffled mechanism's privacy profile, we obtain a band for its privacy guarantee expressed in terms of shuffle indices. Furthermore, we derive a simple structural condition on the local randomizer that is necessary and sufficient for this band to collapse asymptotically. $k$-RR families with $k\ge3$ satisfy this condition, while for generalized Gaussian mechanisms the condition may not hold but the resulting band remains tight. Finally, we complement the asymptotic theory with an FFT-based algorithm for computing the blanket divergence at finite $n$, which offers rigorously controlled relative error and near-linear running time in $n$, providing a practical numerical tool for analyzing shuffle DP.
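To make the motivating question concrete, the following minimal sketch computes the total mass $\gamma$ of the privacy blanket in the sense of Balle et al., that is, the integral over outputs of the pointwise infimum of the output densities over all inputs. It is an illustration only and does not implement the shuffle index $\chi$ or the FFT-based algorithm of this paper. The function names, the choice of input domain $[0,1]$ for the Gaussian mechanism, and the parameter values are assumptions made for the example.

```python
import math


def krr_probs(k, eps0):
    """Rows P(y | x) of k-ary randomized response with local parameter eps0."""
    denom = math.exp(eps0) + k - 1
    p_true = math.exp(eps0) / denom   # probability of reporting the true value
    p_other = 1.0 / denom             # probability of each other value
    return [[p_true if y == x else p_other for y in range(k)] for x in range(k)]


def blanket_mass_discrete(rows):
    """gamma = sum over outputs y of min over inputs x of P(y | x)."""
    n_outputs = len(rows[0])
    return sum(min(row[y] for row in rows) for y in range(n_outputs))


def std_normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def blanket_mass_gaussian(sigma):
    """gamma for the Gaussian mechanism y = x + N(0, sigma^2), x in [0, 1].

    Assumption: inputs are confined to [0, 1]. For each output y the
    smallest density is attained at the endpoint farthest from y, which
    gives the closed form gamma = 2 * Phi(-1 / (2 * sigma)).
    """
    return 2.0 * std_normal_cdf(-1.0 / (2.0 * sigma))


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    print("k-RR  (k=3, eps0=1):", blanket_mass_discrete(krr_probs(3, 1.0)))
    print("Gauss (sigma=1):    ", blanket_mass_gaussian(1.0))
```

Under these assumptions the Gaussian mechanism has a strictly positive blanket mass even though it satisfies pure local DP for no finite $\varepsilon_0$, which is one way to see why $\varepsilon_0$ alone may be a poor summary of how well a mechanism shuffles.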
