Achieving optimal statistical performance while protecting the privacy of personal data is a challenging yet crucial objective in modern data analysis. Characterizing optimality under privacy constraints, and in particular establishing minimax lower bounds, is technically difficult. To address this issue, we propose a novel approach called the score attack, which provides a lower bound on the differential-privacy-constrained minimax risk of parameter estimation. The score attack builds on the tracing attack concept in differential privacy and applies to any statistical model with a well-defined score statistic. For a range of statistical problems, it lower bounds the minimax risk of estimating unknown model parameters under differential privacy optimally, up to a logarithmic factor. We demonstrate the effectiveness and optimality of this general method in several examples: the generalized linear model in both classical and high-dimensional sparse settings, the Bradley-Terry-Luce model for pairwise comparisons, and nonparametric regression over the Sobolev class.
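To make the idea of a score-based tracing attack concrete, here is a minimal toy sketch. It is not taken from the abstract and rests on several assumptions: the attack is formalized as the inner product between the estimation error and the score of a candidate data point, the model is a Gaussian mean model N(theta, I) whose score is `z - theta`, and a plain, non-private sample mean stands in for the estimator. All function names are hypothetical.

```python
import random


def score_gaussian_mean(z, theta):
    """Score of N(theta, I) at z: gradient of the log-density w.r.t. theta."""
    return [zi - ti for zi, ti in zip(z, theta)]


def score_attack(z, estimate, theta):
    """Assumed attack form: <estimate - theta, score(z; theta)>."""
    s = score_gaussian_mean(z, theta)
    return sum((e - t) * si for e, t, si in zip(estimate, theta, s))


def sample_mean(data):
    """Coordinate-wise mean; a non-private stand-in for a generic estimator."""
    n = len(data)
    return [sum(col) / n for col in zip(*data)]


def simulate(n=50, d=5, trials=2000, seed=0):
    """Average attack value for a sample member vs. an independent fresh draw."""
    rng = random.Random(seed)
    theta = [0.0] * d
    in_total = 0.0
    out_total = 0.0
    for _ in range(trials):
        data = [[rng.gauss(t, 1.0) for t in theta] for _ in range(n)]
        fresh = [rng.gauss(t, 1.0) for t in theta]
        est = sample_mean(data)
        in_total += score_attack(data[0], est, theta)   # point used by the estimator
        out_total += score_attack(fresh, est, theta)    # point independent of it
    return in_total / trials, out_total / trials


if __name__ == "__main__":
    avg_in, avg_out = simulate()
    print("member of the sample:", avg_in)
    print("independent point:   ", avg_out)
```

In this toy model the attack has expectation d/n for a point in the sample and expectation zero for an independent point, so an accurate estimator reveals membership on average. A differentially private estimator must limit how much any single point can shift the attack value; that tension is the intuition behind using such attacks for lower bounds.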