We present a method for producing unbiased parameter estimates and valid confidence intervals under the constraints of differential privacy, a formal framework for limiting individual information leakage from sensitive data. Prior work in this area is limited in that it is tailored to calculating confidence intervals for specific statistical procedures, such as mean estimation or simple linear regression. Other recent work can produce confidence intervals for more general sets of procedures, but it either yields only approximately unbiased estimates, is designed for one-dimensional outputs, or assumes significant user knowledge about the data-generating distribution. Our method induces distributions of mean and covariance estimates via the bag of little bootstraps (BLB) and uses them to privately estimate the parameters' sampling distribution via a generalized version of the CoinPress estimation algorithm. If the user can bound the BLB-induced parameters and provide heavier-tailed distribution families, the algorithm produces unbiased parameter estimates and valid confidence intervals that hold with arbitrarily high probability. These results hold in high dimensions and for any estimation procedure that behaves nicely under the bootstrap.
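To make the BLB step concrete, the following is a minimal, non-private sketch of how the bag of little bootstraps can induce one mean and one covariance estimate per disjoint subset of the data. It is an illustration under stated assumptions, not the authors' implementation: the function names `blb_mean_cov` and `weighted_mean`, the subset and resample counts, and the synthetic data are all hypothetical, and the private aggregation step (the generalized CoinPress algorithm) is omitted entirely, so this code provides no privacy guarantee.

```python
import numpy as np

def blb_mean_cov(data, estimator, n_subsets, n_resamples, rng):
    """Non-private bag of little bootstraps (illustrative sketch).

    For each disjoint subset, returns the mean and covariance of the
    estimator over multinomial resamples of nominal size n.
    """
    n = data.shape[0]
    # Shuffle, then split into disjoint subsets so each record
    # lands in exactly one subset.
    subsets = np.array_split(rng.permutation(n), n_subsets)
    means, covs = [], []
    for idx in subsets:
        subset = data[idx]
        b = len(idx)
        # Each resample spreads total weight n over the b subset points.
        counts = rng.multinomial(n, np.full(b, 1.0 / b), size=n_resamples)
        estimates = np.array([estimator(subset, w) for w in counts])
        means.append(estimates.mean(axis=0))
        covs.append(np.atleast_2d(np.cov(estimates, rowvar=False)))
    return np.array(means), np.array(covs)

def weighted_mean(x, weights):
    # Hypothetical example estimator: weighted column means.
    return np.average(x, axis=0, weights=weights)

# Synthetic data, assumed purely for demonstration.
rng = np.random.default_rng(0)
data = rng.normal(loc=[1.0, -2.0], scale=1.0, size=(2000, 2))
means, covs = blb_mean_cov(data, weighted_mean,
                           n_subsets=10, n_resamples=50, rng=rng)
```

Because every record belongs to exactly one subset, changing a single record alters only one of the per-subset (mean, covariance) pairs. That property is presumably what makes these pairs a convenient input for a private aggregation step such as the one the abstract describes.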