We revisit the problem of differentially private squared error linear regression. We observe that existing state-of-the-art methods are sensitive to the choice of hyper-parameters -- including the ``clipping threshold,'' which cannot be set optimally in a data-independent way. We give a new algorithm for private linear regression based on gradient boosting. We show that our method consistently improves over the previous state of the art when the clipping threshold is fixed without knowledge of the data, rather than optimized in a non-private way -- and that even when the clipping threshold is optimized non-privately, our algorithm is no worse. In addition to a comprehensive set of experiments, we give theoretical insights to explain this behavior.
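To make the role of the clipping threshold concrete, here is a minimal sketch of a generic clipped, noisy gradient estimate for squared-error linear regression. It is not the gradient-boosting algorithm described above, and the noise scale is not calibrated to any particular privacy budget. The function name `clipped_noisy_gradient`, the parameters `clip` and `noise_multiplier`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def clipped_noisy_gradient(X, y, theta, clip, noise_multiplier, rng):
    """Return (noisy mean gradient, fraction of examples that were clipped).

    Hypothetical helper for illustration; not the paper's method.
    """
    residuals = X @ theta - y                  # shape (n,)
    # Per-example gradient of 0.5 * (x . theta - y)^2, shape (n, d).
    per_example = residuals[:, None] * X
    norms = np.linalg.norm(per_example, axis=1)
    # Rescale each gradient so its L2 norm is at most `clip`.
    scale = np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    clipped = per_example * scale[:, None]
    # Gaussian noise whose standard deviation grows with the threshold.
    noise = rng.normal(0.0, noise_multiplier * clip, size=theta.shape)
    grad = (clipped.sum(axis=0) + noise) / len(y)
    return grad, float(np.mean(norms > clip))

# Synthetic data (an assumption, for demonstration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)
theta = np.zeros(3)

for clip in (0.1, 1.0, 10.0):
    grad, frac = clipped_noisy_gradient(X, y, theta, clip, 1.0, rng)
    print(f"clip={clip}: fraction clipped={frac:.2f}, gradient={grad}")
```

A small threshold clips most per-example gradients and biases the estimate, while a large threshold adds more noise. The best trade-off depends on the scale of the data, which is why a threshold fixed without looking at the data can be far from optimal.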