Differential Privacy (DP) provides a rigorous framework for deriving privacy-preserving estimators by injecting calibrated noise to mask individual contributions while preserving population-level insights. Its central challenge lies in the privacy-utility trade-off: calibrating noise levels to ensure robust protection without compromising statistical performance. Standard DP methods struggle with a class of two-stage problems prevalent in individualized treatment rules (ITRs) and causal inference. In these settings, data-dependent weights are first computed to satisfy distributional constraints, such as covariate balance, before the final parameter of interest is estimated. Current DP approaches often privatize the stages independently, which either degrades the efficacy of the weights, leading to biased and inconsistent estimates, or introduces excessive noise to account for worst-case scenarios. To address these challenges, we propose Differentially Private Two-Stage Empirical Risk Minimization (DP-2ERM), a framework that injects carefully calibrated noise only into the second stage while maintaining privacy for the entire pipeline and preserving the integrity of the first-stage weights. Our theoretical contributions include deterministic bounds on weight perturbations for several widely used weighting methods, and probabilistic bounds on the sensitivity of the final estimator. Simulations and real-world applications to ITRs demonstrate that DP-2ERM significantly improves utility over existing methods while providing rigorous privacy guarantees.
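To make the two-stage structure concrete, below is a minimal sketch under illustrative assumptions: inverse propensity weighting stands in for the first-stage weighting method, weighted least squares for the second-stage estimator, and the standard Gaussian output-perturbation mechanism for the noise injection. The function names and the `sensitivity` parameter are hypothetical placeholders; the paper's actual noise calibration, derived from its sensitivity bounds, is not reproduced here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def stage1_weights(X, treatment):
    """Stage 1: compute data-dependent weights WITHOUT added noise.
    Here: simple inverse propensity weights from a logistic fit,
    standing in for any covariate-balancing weighting method."""
    propensity = LogisticRegression().fit(X, treatment).predict_proba(X)[:, 1]
    return np.where(treatment == 1, 1.0 / propensity, 1.0 / (1.0 - propensity))

def stage2_private_estimate(X, y, weights, epsilon, delta, sensitivity, rng):
    """Stage 2: weighted least-squares estimate released via Gaussian
    output perturbation. `sensitivity` is an assumed L2 sensitivity of
    the weighted estimator to one individual's record; bounding this
    quantity tightly is where a framework like DP-2ERM does its work."""
    W = np.diag(weights)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)  # weighted ERM solution
    # Gaussian mechanism: sigma calibrated for (epsilon, delta)-DP.
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return beta + rng.normal(0.0, sigma, size=beta.shape)
```

The key design point the sketch illustrates is that only the second-stage release is randomized, so the weights retain their balancing properties while the overall output still satisfies a formal DP guarantee whose strength depends on how tightly the estimator's sensitivity can be bounded.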