This paper presents new methods for practical differentially private (DP) estimation and inference in high-dimensional linear regression. We first propose a differentially private Bayesian Information Criterion (BIC) for selecting the unknown sparsity parameter in DP-Lasso, which removes the need for prior knowledge of the model sparsity that the existing literature requires. We then propose a differentially private debiased Lasso algorithm that leverages the inherent sparsity of high-dimensional linear regression models to enable accurate, privacy-preserving inference on the regression parameters. Finally, we address multiple testing in high-dimensional linear regression by introducing a differentially private multiple testing procedure that controls the false discovery rate (FDR), allowing accurate and privacy-preserving identification of significant predictors in the regression model. Through extensive simulations and real data analysis, we demonstrate that the proposed methods are effective for inference in high-dimensional linear models while safeguarding privacy and controlling the FDR.