We analyze to what extent final users can infer information about the level of protection of their data when the data obfuscation mechanism is a priori unknown to them (the so-called ''black-box'' scenario). In particular, we delve into the investigation of two notions of local differential privacy (LDP), namely {\epsilon}-LDP and R\'enyi LDP. On one hand, we prove that, without any assumption on the underlying distributions, it is not possible to have an algorithm able to infer the level of data protection with provable guarantees; this result also holds for the central versions of the two notions of DP considered. On the other hand, we demonstrate that, under reasonable assumptions (namely, Lipschitzness of the involved densities on a closed interval), such guarantees exist and can be achieved by a simple histogram-based estimator. We validate our results experimentally and we note that, on a particularly well-behaved distribution (namely, the Laplace noise), our method gives even better results than expected, in the sense that in practice the number of samples needed to achieve the desired confidence is smaller than the theoretical bound, and the estimation of {\epsilon} is more precise than predicted.
翻译:我们分析了在数据混淆机制对最终用户先验未知的情况下(即所谓的“黑箱”场景),用户能在多大程度上推断其数据保护水平的信息。具体而言,我们深入研究了两种局部差分隐私概念,即ε-局部差分隐私和Rényi局部差分隐私。一方面,我们证明,在不对底层分布做任何假设的前提下,不可能存在一种能够以可证明的保证推断数据保护水平的算法;该结论同样适用于所考虑的两种差分隐私概念的集中式版本。另一方面,我们证明,在合理假设下(即封闭区间内所涉密度的Lipschitz连续性),这种保证是存在的,并且可以通过一种简单的基于直方图的估计器实现。我们通过实验验证了结果,并注意到,在一种表现特别良好的分布(即拉普拉斯噪声)下,我们的方法得到了比预期更好的结果,即实际上达到所需置信度所需的样本数量小于理论界限,且对ε的估计比预测的更精确。