We consider the problem of model selection in a high-dimensional sparse linear regression model under privacy constraints. We propose a differentially private best subset selection method with strong utility properties, which uses the well-known exponential mechanism to select the best model. We design an efficient Metropolis-Hastings algorithm and establish that it mixes to its stationary distribution in polynomial time. Using this mixing property, we also establish approximate differential privacy for the final estimates of the Metropolis-Hastings random walk. Finally, we perform illustrative experiments that show the strong utility of our algorithm.
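As a rough illustration of the idea, the sketch below runs a Metropolis-Hastings random walk over size-s subsets whose stationary distribution is an exponential mechanism with the residual sum of squares as the score. It is a minimal sketch under stated assumptions, not the algorithm analyzed in this work: the function names (`rss`, `mh_subset_sampler`), the swap proposal, and the `sensitivity` argument (a user-supplied bound on how much the score can change between neighboring datasets) are all hypothetical choices made for this example. Privacy would additionally require that the sensitivity bound actually holds for the data and that the chain is run long enough to mix.

```python
import numpy as np


def rss(X, y, subset):
    """Residual sum of squares of the least-squares fit on the given columns."""
    Xs = X[:, sorted(subset)]
    coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ coef
    return float(resid @ resid)


def mh_subset_sampler(X, y, s, eps, sensitivity, n_steps, rng):
    """Metropolis-Hastings walk over subsets of size s (requires s < p).

    Target: pi(S) proportional to exp(-eps * RSS(S) / (2 * sensitivity)),
    i.e. an exponential mechanism with utility -RSS(S).
    `sensitivity` is an ASSUMED bound supplied by the caller.
    """
    p = X.shape[1]
    current = set(int(j) for j in rng.choice(p, size=s, replace=False))
    current_rss = rss(X, y, current)
    scale = eps / (2.0 * sensitivity)

    for _ in range(n_steps):
        # Symmetric swap proposal: drop one selected index, add one unselected.
        out_idx = int(rng.choice(sorted(current)))
        in_idx = int(rng.choice(sorted(set(range(p)) - current)))
        proposal = (current - {out_idx}) | {in_idx}
        proposal_rss = rss(X, y, proposal)

        # Because the proposal is symmetric, the acceptance probability is
        # min(1, pi(proposal) / pi(current)).
        log_ratio = -scale * (proposal_rss - current_rss)
        if rng.random() < np.exp(min(0.0, log_ratio)):
            current, current_rss = proposal, proposal_rss

    return sorted(current)


# Usage on synthetic data (all values here are illustrative).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.1 * rng.standard_normal(50)
selected = mh_subset_sampler(X, y, s=2, eps=1.0, sensitivity=1.0,
                             n_steps=200, rng=rng)
print(selected)
```

The swap proposal keeps the subset size fixed at s and is symmetric, so the acceptance ratio depends only on the change in the score; other proposals would need the corresponding Hastings correction.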