We consider model selection in a high-dimensional sparse linear regression model under privacy constraints. We propose a differentially private best subset selection method with strong utility properties, which uses the well-known exponential mechanism to select the best model. We design an efficient Metropolis-Hastings algorithm and establish that it mixes to its stationary distribution in polynomial time. We also establish approximate differential privacy for the estimates produced by the mixed Metropolis-Hastings chain. Finally, illustrative experiments demonstrate the strong utility of our algorithm.
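To make the combination of the exponential mechanism and Metropolis-Hastings concrete, here is a minimal sketch and not the authors' algorithm. It assumes the score of a subset is its negative residual sum of squares, that the target distribution is proportional to exp(-epsilon * RSS / (2 * sensitivity)), and that proposals swap one selected feature for one unselected feature. The names `rss`, `mh_subset_sampler`, `epsilon`, and `sensitivity` are hypothetical, and the sensitivity bound is taken as given (for example, from bounded or clipped data) rather than derived.

```python
import numpy as np


def rss(X, y, subset):
    """Residual sum of squares of the least-squares fit on the columns in `subset`."""
    Xs = X[:, sorted(subset)]
    coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ coef
    return float(resid @ resid)


def mh_subset_sampler(X, y, s, epsilon, sensitivity, n_steps, rng):
    """Run a swap-proposal Metropolis-Hastings chain over size-s subsets.

    Assumed target: pi(S) proportional to exp(-epsilon * RSS(S) / (2 * sensitivity)),
    i.e. an exponential mechanism with score -RSS. `sensitivity` is an assumed
    bound on how much RSS can change when one data point changes.
    Requires 0 < s < number of columns of X.
    """
    p = X.shape[1]
    current = set(int(j) for j in rng.choice(p, size=s, replace=False))
    current_rss = rss(X, y, current)
    scale = epsilon / (2.0 * sensitivity)

    for _ in range(n_steps):
        # Symmetric proposal: swap one selected index with one unselected index.
        out_idx = int(rng.choice(sorted(current)))
        in_idx = int(rng.choice(sorted(set(range(p)) - current)))
        proposal = (current - {out_idx}) | {in_idx}
        proposal_rss = rss(X, y, proposal)

        # The proposal is symmetric, so the acceptance ratio is just the target ratio.
        log_accept = -scale * (proposal_rss - current_rss)
        if np.log(rng.random()) < log_accept:
            current, current_rss = proposal, proposal_rss

    return sorted(current)
```

A call such as `mh_subset_sampler(X, y, s=2, epsilon=1.0, sensitivity=1.0, n_steps=200, rng=np.random.default_rng(0))` returns a sorted list of `s` column indices. The sketch carries no formal privacy guarantee by itself: any such guarantee depends on the sensitivity bound actually holding for the data and on the chain having run long enough to be close to its stationary distribution.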