We study privacy-preserving sparse linear regression in the high-dimensional regime, focusing on the LASSO estimator. We analyze two widely used mechanisms for differential privacy: output perturbation, which injects noise into the estimator, and objective perturbation, which adds a random linear term to the loss function. Using approximate message passing (AMP), we characterize the typical behavior of these estimators under random design and privacy noise. To quantify privacy, we adopt typical-case measures, including the on-average KL divergence, which admits a hypothesis-testing interpretation in terms of distinguishability between neighboring datasets. Our analysis reveals that sparsity plays a central role in shaping the privacy-accuracy trade-off: stronger regularization can improve privacy by stabilizing the estimator against single-point data changes. We further show that the two mechanisms exhibit qualitatively different behaviors. In particular, for objective perturbation, increasing the noise level can have non-monotonic effects, and excessive noise may destabilize the estimator, leading to increased sensitivity to data perturbations. Our results demonstrate that AMP provides a powerful framework for analyzing privacy-accuracy trade-offs in high-dimensional sparse models.
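As a rough illustration of the two mechanisms named above, the following sketch contrasts output perturbation (noise added to the fitted LASSO estimator) with objective perturbation (a random linear term added to the loss before fitting). Several choices here are assumptions not taken from the text: the noise is Gaussian with scale `sigma`, the loss is the unnormalized squared error `0.5 * ||y - X w||^2`, and the solver is plain proximal gradient descent (ISTA) rather than AMP. The function names are hypothetical and no privacy calibration of `sigma` is attempted.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_ista(X, y, lam, b=None, n_iter=500):
    """Minimize 0.5*||y - X w||^2 + b @ w + lam*||w||_1 by ISTA.

    `b` is an optional linear term; b=None gives the ordinary LASSO.
    """
    d = X.shape[1]
    if b is None:
        b = np.zeros(d)
    # Step size 1/L, where L is the squared spectral norm of X
    # (the Lipschitz constant of the gradient of the smooth part).
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) + b
        w = soft_threshold(w - step * grad, step * lam)
    return w


def output_perturbation(X, y, lam, sigma, rng):
    """Fit the LASSO, then add noise to the estimator (Gaussian is assumed)."""
    w = lasso_ista(X, y, lam)
    return w + sigma * rng.standard_normal(w.shape)


def objective_perturbation(X, y, lam, sigma, rng):
    """Add a random linear term to the loss, then fit (Gaussian is assumed)."""
    b = sigma * rng.standard_normal(X.shape[1])
    return lasso_ista(X, y, lam, b=b)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # n > d here, so X has full column rank (almost surely) and the
    # perturbed objective stays bounded below for any linear term.
    X = rng.standard_normal((50, 10))
    w_true = np.zeros(10)
    w_true[:2] = 1.0
    y = X @ w_true
    print(output_perturbation(X, y, lam=0.1, sigma=0.05, rng=rng))
    print(objective_perturbation(X, y, lam=0.1, sigma=0.05, rng=rng))
```

In this sketch the output-perturbed estimator is generally dense, since noise is added after thresholding, whereas the objective-perturbed estimator still passes through the soft-thresholding step and can remain sparse.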