We consider model selection in a high-dimensional sparse linear regression model under privacy constraints. We propose a differentially private best subset selection method with strong utility properties, which adopts the well-known exponential mechanism to select the best model. We develop an efficient Metropolis-Hastings algorithm and establish that it mixes to its stationary distribution in polynomial time. We also establish approximate differential privacy for the estimates produced by the mixed Metropolis-Hastings chain. Finally, we present illustrative experiments that demonstrate the strong utility of our algorithm.
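To make the idea concrete, the following is a minimal sketch of a Metropolis-Hastings sampler whose stationary distribution is an exponential mechanism over size-s subsets. It is an illustration under stated assumptions, not the method of the paper: the utility is taken to be the negative residual sum of squares, the proposal is a symmetric single swap, and `sensitivity` is a user-supplied bound on how much the utility can change between neighbouring datasets (in practice this requires bounded or clipped data). The names `rss` and `mh_subset_sampler` are hypothetical. A finite run only approximates the target distribution, so the sketch on its own carries no formal privacy guarantee.

```python
import numpy as np


def rss(X, y, subset):
    """Residual sum of squares of the least-squares fit on the given columns."""
    Xs = X[:, sorted(subset)]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ beta
    return float(resid @ resid)


def mh_subset_sampler(X, y, s, eps, sensitivity, n_iter, rng):
    """Random-walk Metropolis-Hastings over subsets of size s.

    Target (assumed form): P(S) proportional to
    exp(-eps * RSS(S) / (2 * sensitivity)).
    """
    p = X.shape[1]
    current = set(rng.choice(p, size=s, replace=False).tolist())
    current_rss = rss(X, y, current)
    for _ in range(n_iter):
        inside = sorted(current)
        outside = [j for j in range(p) if j not in current]
        # Symmetric proposal: swap one selected column for one unselected column.
        drop = inside[rng.integers(len(inside))]
        add = outside[rng.integers(len(outside))]
        proposal = (current - {drop}) | {add}
        proposal_rss = rss(X, y, proposal)
        # Log acceptance ratio; the proposal terms cancel because it is symmetric.
        log_ratio = eps * (current_rss - proposal_rss) / (2.0 * sensitivity)
        if np.log(rng.random()) < log_ratio:
            current, current_rss = proposal, proposal_rss
    return sorted(current)


# Synthetic example: only columns 0 and 1 carry signal.
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 8))
y = 3.0 * X[:, 0] - 3.0 * X[:, 1] + 0.5 * rng.standard_normal(60)
selected = mh_subset_sampler(X, y, s=2, eps=1.0, sensitivity=1.0, n_iter=300, rng=rng)
print(selected)
```

Because the swap proposal keeps the subset size fixed, each move and its reverse are proposed with equal probability, so the acceptance rule needs only the utility difference. Smaller values of `eps` flatten the target distribution and make the returned subset noisier, which is the usual privacy and utility trade-off of the exponential mechanism.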