We study privacy-preserving sparse linear regression in the high-dimensional regime, focusing on the LASSO estimator. We analyze two widely used mechanisms for differential privacy: output perturbation, which injects noise into the estimator, and objective perturbation, which adds a random linear term to the loss function. Using approximate message passing (AMP), we characterize the typical behavior of these estimators under random design and privacy noise. To quantify privacy, we adopt typical-case measures, including the on-average KL divergence, which admits a hypothesis-testing interpretation in terms of distinguishability between neighboring datasets. Our analysis reveals that sparsity plays a central role in shaping the privacy-accuracy trade-off: stronger regularization can improve privacy by stabilizing the estimator against single-point data changes. We further show that the two mechanisms exhibit qualitatively different behaviors. In particular, for objective perturbation, increasing the noise level can have non-monotonic effects, and excessive noise may destabilize the estimator, leading to increased sensitivity to data perturbations. Our results demonstrate that AMP provides a powerful framework for analyzing privacy-accuracy trade-offs in high-dimensional sparse models.
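As a rough illustration of the two mechanisms named above, the following sketch contrasts output perturbation (noise added to the fitted LASSO estimate) with objective perturbation (a random linear term added to the loss before fitting). It is only a minimal sketch under stated assumptions: the Gaussian noise, the scale parameter `sigma`, the unnormalized squared loss, and the plain proximal-gradient (ISTA) solver are illustrative choices that the abstract does not specify, and the function names are hypothetical. It does not implement the AMP analysis or a calibrated differential-privacy guarantee.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_ista(X, y, lam, b=None, n_iter=500):
    """Approximately minimize 0.5*||y - X w||^2 + b^T w + lam*||w||_1 with ISTA.

    The linear term b^T w is the hook used for objective perturbation;
    b=None gives the ordinary LASSO objective.
    """
    d = X.shape[1]
    b = np.zeros(d) if b is None else b
    # Step size 1/L, where L is the squared spectral norm of X.
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) + b  # gradient of the smooth part
        w = soft_threshold(w - step * grad, step * lam)
    return w


def output_perturbation(X, y, lam, sigma, rng):
    """Fit the LASSO, then add noise to the estimate (Gaussian is an assumption)."""
    w = lasso_ista(X, y, lam)
    return w + sigma * rng.standard_normal(w.shape)


def objective_perturbation(X, y, lam, sigma, rng):
    """Add a random linear term to the loss, then fit (Gaussian is an assumption)."""
    b = sigma * rng.standard_normal(X.shape[1])
    return lasso_ista(X, y, lam, b=b)
```

One structural difference is visible even in this toy version: output perturbation returns a dense vector, because noise is added to every coordinate after fitting, whereas objective perturbation still passes through the soft-thresholding step and so can return exact zeros. Note also that when the design has more columns than rows, a large linear term can make the perturbed objective unbounded below, so the sketch is best tried with small `sigma` or with more samples than features.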