Differentially private (DP) machine learning pipelines typically involve two phases: non-private pre-training on a public dataset, followed by fine-tuning on private data with DP optimization techniques. In the DP setting, full fine-tuning has been observed not to always yield the best test accuracy, even on in-distribution data. This paper (1) analyzes the training dynamics of DP linear probing (LP) and full fine-tuning (FT), and (2) explores sequential fine-tuning, which starts with linear probing and then transitions to full fine-tuning (LP-FT), and its effect on test loss. We provide theoretical insights into the convergence of DP fine-tuning in an overparameterized neural network and establish a utility curve that determines how the privacy budget should be allocated between linear probing and full fine-tuning. The theoretical results are supported by empirical evaluations on various benchmarks and models. Together, the findings reveal the complex nature of DP fine-tuning methods, contribute to a deeper understanding of DP machine learning, and highlight the importance of privacy budget allocation in the fine-tuning process.
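The LP-FT schedule described above can be illustrated with a minimal sketch. It assumes a toy two-layer linear model f(x) = v · (W x) with squared loss, trained by full-batch gradient descent with per-example gradient clipping and Gaussian noise. The function names `clip` and `dp_lp_ft`, the use of step counts (`lp_steps`, `ft_steps`) as a stand-in for the privacy budget split, and all hyperparameter values are illustrative assumptions and not the paper's setup; no privacy accounting is performed.

```python
import math
import random

def clip(g, max_norm):
    """Rescale the flat gradient g so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in g))
    scale = min(1.0, max_norm / max(norm, 1e-12))
    return [x * scale for x in g]

def dp_lp_ft(X, y, W, v, lp_steps, ft_steps,
             lr=0.1, max_norm=1.0, sigma=1.0, seed=0):
    """Noisy clipped gradient descent on f(x) = v . (W x).

    The first lp_steps updates touch only the head v (linear probing);
    the next ft_steps updates touch both v and W (full fine-tuning).
    Hypothetical helper for illustration; it does not track epsilon.
    """
    rng = random.Random(seed)
    W = [row[:] for row in W]          # copy so the inputs are not modified
    v = list(v)
    n, k, d = len(X), len(v), len(X[0])
    for t in range(lp_steps + ft_steps):
        full = t >= lp_steps           # True once the FT phase has started
        size = k + (k * d if full else 0)
        total = [0.0] * size
        for x, target in zip(X, y):
            h = [sum(W[j][i] * x[i] for i in range(d)) for j in range(k)]
            r = sum(v[j] * h[j] for j in range(k)) - target
            g = [r * h[j] for j in range(k)]                 # d(loss)/dv
            if full:                                         # d(loss)/dW
                g += [r * v[j] * x[i] for j in range(k) for i in range(d)]
            # Clip each example's gradient before summing.
            for idx, gi in enumerate(clip(g, max_norm)):
                total[idx] += gi
        # Gaussian noise scaled to the clipping norm, then average.
        noisy = [(s + rng.gauss(0.0, sigma * max_norm)) / n for s in total]
        for j in range(k):
            v[j] -= lr * noisy[j]
        if full:
            for j in range(k):
                for i in range(d):
                    W[j][i] -= lr * noisy[k + j * d + i]
    return W, v
```

Calling `dp_lp_ft(X, y, W, v, lp_steps=T, ft_steps=0)` gives pure linear probing, `lp_steps=0` gives pure full fine-tuning, and intermediate splits give LP-FT. In this sketch each step is assumed to cost the same privacy, so varying the split at a fixed total step count is only a rough proxy for the budget allocation the abstract refers to.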