We design a debiased parametric bootstrap framework for statistical inference from differentially private data. Existing applications of the parametric bootstrap to privatized data have ignored or avoided the biases that the privacy mechanism can introduce, for example through clamping, a technique employed by the majority of privacy mechanisms. Because parameter estimates based on the privatized data are then inconsistent, ignoring these biases leads to under-coverage of confidence intervals and miscalibrated type I error rates of hypothesis tests. We propose using the indirect inference method to estimate the parameter values consistently, and we use the resulting estimator in the parametric bootstrap for inference. To implement the indirect estimator, we present a novel simulation-based, adaptive approach, along with theory that establishes the consistency of the corresponding parametric bootstrap estimates, confidence intervals, and hypothesis tests. In particular, we prove that our adaptive indirect estimator achieves the minimum asymptotic variance among all "well-behaved" consistent estimators based on the released summary statistic. Our simulation studies show that our framework produces confidence intervals with well-calibrated coverage and performs hypothesis testing with the correct type I error, giving state-of-the-art performance for inference in several settings.
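To make the clamping bias and the indirect-inference idea concrete, here is a minimal toy sketch. It is not the adaptive indirect estimator described above. It assumes a normal model with unknown mean and known unit variance, a released statistic equal to the clamped sample mean plus Laplace noise, and a simple bisection search. The function names `private_clamped_mean` and `indirect_estimate`, the clamping bounds, and all numeric settings are hypothetical choices made for illustration only.

```python
import numpy as np


def private_clamped_mean(x, lo, hi, epsilon, rng):
    """Release the mean of x clamped to [lo, hi], plus Laplace noise.

    The clamped mean has sensitivity (hi - lo) / n, so the Laplace scale
    is (hi - lo) / (n * epsilon).
    """
    n = len(x)
    scale = (hi - lo) / (n * epsilon)
    return np.clip(x, lo, hi).mean() + rng.laplace(scale=scale)


def indirect_estimate(s_obs, n, lo, hi, epsilon, n_sim=200, seed=0,
                      bracket=(-10.0, 10.0), iters=60):
    """Toy indirect estimator for the mean of N(theta, 1) data.

    Finds theta whose average simulated private statistic matches s_obs.
    The random draws are generated once and reused for every theta, so
    the simulated average is a deterministic, non-decreasing function of
    theta and bisection applies.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_sim, n))  # fixed data-generating draws
    noise = rng.laplace(scale=(hi - lo) / (n * epsilon), size=n_sim)

    def simulated_mean(theta):
        s = np.clip(theta + z, lo, hi).mean(axis=1) + noise
        return s.mean()

    a, b = bracket
    for _ in range(iters):
        mid = 0.5 * (a + b)
        if simulated_mean(mid) < s_obs:
            a = mid
        else:
            b = mid
    return 0.5 * (a + b)


# Hypothetical setting: the true mean sits close to the upper clamping
# bound, so clamping pulls the released mean downward.
theta_true, n, lo, hi, epsilon = 2.5, 2000, 0.0, 3.0, 1.0
rng = np.random.default_rng(1)
x = theta_true + rng.standard_normal(n)

s_obs = private_clamped_mean(x, lo, hi, epsilon, rng)   # naive estimate
theta_hat = indirect_estimate(s_obs, n, lo, hi, epsilon)

print("naive private estimate:", s_obs)
print("indirect estimate:     ", theta_hat)
```

In this toy setting the naive released statistic is biased because part of the data mass lies above the upper bound, while the indirect estimate inverts the simulated map from the parameter to the released statistic. A parametric bootstrap would then simulate the private statistic at the indirect estimate rather than at the naive one.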