We analyze to what extent final users can infer information about the level of protection of their data when the data obfuscation mechanism is a priori unknown to them (the so-called ''black-box'' scenario). In particular, we delve into the investigation of two notions of local differential privacy (LDP), namely {\epsilon}-LDP and R\'enyi LDP. On one hand, we prove that, without any assumption on the underlying distributions, it is not possible to have an algorithm able to infer the level of data protection with provable guarantees; this result also holds for the central versions of the two notions of DP considered. On the other hand, we demonstrate that, under reasonable assumptions (namely, Lipschitzness of the involved densities on a closed interval), such guarantees exist and can be achieved by a simple histogram-based estimator. We validate our results experimentally and we note that, on a particularly well-behaved distribution (namely, the Laplace noise), our method gives even better results than expected, in the sense that in practice the number of samples needed to achieve the desired confidence is smaller than the theoretical bound, and the estimation of {\epsilon} is more precise than predicted.
翻译:我们分析了当数据混淆机制对最终用户先验未知(即所谓的“黑盒”场景)时,用户能在多大程度上推断出其数据保护水平的信息。具体而言,我们深入研究了两种本地差分隐私(LDP)概念,即ε-LDP和Rényi LDP。一方面,我们证明,在没有对底层分布做任何假设的情况下,不可能存在一种算法能够以可证明的保证推断出数据保护水平;这一结果同样适用于所考虑的两种DP概念的集中式版本。另一方面,我们证明,在合理假设下(即相关密度函数在闭区间上满足Lipschitz条件),这样的保证是存在的,并且可以通过简单的基于直方图的估计器实现。我们通过实验验证了结果,并注意到,在一个特别良性的分布(即拉普拉斯噪声)上,我们的方法甚至取得了比预期更好的结果,具体表现为实际中达到所需置信度所需的样本数量少于理论界,并且对ε的估计比预测的更精确。