Training differentially private machine learning models requires bounding each individual's contribution to the optimization process. This is achieved by clipping the $2$-norm of each individual's gradient at a predetermined threshold before averaging and batch sanitization. The choice of threshold harms optimization in two opposing ways: a low value increases the bias caused by excessive clipping, while a high value increases the sanitization noise. The best value depends heavily on the dataset and model architecture, and can even change over the course of a single optimization run, so it demands careful tuning, usually through a grid search. To avoid the privacy cost incurred by hyperparameter tuning, we present a novel approach that dynamically optimizes the clipping threshold. We treat the threshold as an additional learnable parameter and establish a clean relationship between it and the cost function. This allows us to optimize the threshold with gradient descent, with minimal impact on the overall privacy analysis. We evaluate our method against alternative fixed and adaptive strategies across diverse datasets, tasks, model dimensions, and privacy levels. Our results indicate that it performs comparably or better in the evaluated scenarios under the same privacy requirements.
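The clipping and sanitization step described in the abstract can be illustrated with a minimal sketch. It assumes the standard fixed-threshold recipe (clip each gradient, sum, add Gaussian noise with standard deviation proportional to the threshold, then average); it is not the learnable-threshold method proposed here. The names `clip_gradient`, `sanitized_mean`, and `noise_multiplier` are hypothetical and chosen only for illustration.

```python
import math
import random


def l2_norm(vec):
    """Euclidean (2-)norm of a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def clip_gradient(grad, threshold):
    """Rescale grad so that its 2-norm is at most threshold."""
    norm = l2_norm(grad)
    scale = min(1.0, threshold / norm) if norm > 0 else 1.0
    return [x * scale for x in grad]


def sanitized_mean(per_example_grads, threshold, noise_multiplier, rng):
    """Clip each gradient, sum, add Gaussian noise, then average.

    Assumption: the noise standard deviation is noise_multiplier * threshold,
    so a larger threshold means less clipping bias but more noise.
    """
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        clipped = clip_gradient(grad, threshold)
        total = [t + c for t, c in zip(total, clipped)]
    sigma = noise_multiplier * threshold
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [x / len(per_example_grads) for x in noisy]


rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]  # 2-norms: 5, 0.5, 10

# With threshold 1.0, the first and third gradients are scaled down to norm 1;
# the second is already below the threshold and is left unchanged.
clipped_norms = [l2_norm(clip_gradient(g, 1.0)) for g in grads]

update = sanitized_mean(grads, threshold=1.0, noise_multiplier=1.1, rng=rng)
noiseless = sanitized_mean(grads, threshold=1.0, noise_multiplier=0.0, rng=rng)
```

With `threshold=1.0`, two of the three example gradients are clipped, which biases the average; raising the threshold would clip less but scale up `sigma`. This is the trade-off that motivates tuning the threshold.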