Producing accurate statistics while respecting the privacy of the individuals in a sample is an important area of research. We study minimax lower bounds for classes of differentially private estimators. In particular, we show how to characterize the power of a statistical test under differential privacy in a plug-and-play fashion by solving an appropriate transport problem. With specific coupling constructions, this observation allows us to derive Le Cam-type and Fano-type inequalities not only for standard definitions of differential privacy but also for those based on Rényi divergence. We then illustrate our results on three simple, fully worked-out examples. In particular, we show that the problem class strongly influences the provable degradation of utility due to privacy. In some scenarios, maintaining privacy noticeably reduces performance only when the level of privacy protection is very high. For other problems, by contrast, even a modest level of privacy protection can significantly decrease performance. Finally, we demonstrate that the DP-SGLD algorithm, a private convex solver, can be employed for maximum likelihood estimation with a high degree of confidence, as it provides near-optimal results with respect to both the sample size and the level of privacy protection. This algorithm is applicable to a broad range of parametric estimation procedures, including exponential families.