Differentially Private Stochastic Gradient Descent (DP-SGD) and its variants have been proposed to provide rigorous privacy guarantees when fine-tuning large-scale pre-trained language models. However, they rely heavily on the Gaussian mechanism, which can over-perturb the gradients and degrade accuracy, especially in strong privacy regimes (e.g., privacy budget $\epsilon < 3$). To address these limitations, we propose a novel Language Model-based Optimal Differential Privacy (LMO-DP) mechanism, which takes a first step toward tight composition when accurately fine-tuning (large) language models with a sub-optimal DP mechanism, even in strong privacy regimes (e.g., $0.1\leq \epsilon<3$). Furthermore, we propose a novel offline optimal noise search method that efficiently derives a sub-optimal DP mechanism with significantly reduced noise magnitude. For instance, fine-tuning RoBERTa-large (with 300M parameters) on the SST-2 dataset achieves an accuracy of 92.20% (given $\epsilon=0.3$, $\delta=10^{-10}$), drastically outperforming the Gaussian mechanism (e.g., $\sim 50\%$ for small $\epsilon$ and $\delta$). We observe similar results on text generation tasks with GPT-2. Finally, to the best of our knowledge, LMO-DP is also the first solution to accurately fine-tune Llama-2 with strong differential privacy guarantees. The code will be released soon and is available upon request.
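For readers unfamiliar with the baseline, the following is a minimal sketch of a single DP-SGD update using the Gaussian mechanism that the abstract contrasts against. It is not the LMO-DP mechanism, whose noise distribution is not specified in this abstract. The function names `clip_by_l2_norm` and `dp_sgd_gaussian_step` and the parameters `max_norm` and `noise_multiplier` are hypothetical, chosen for illustration only, and the sketch assumes plain Python lists rather than any particular deep learning framework.

```python
import math
import random


def clip_by_l2_norm(grad, max_norm):
    """Rescale grad so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0.0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_gaussian_step(params, per_example_grads, lr, max_norm,
                         noise_multiplier, rng):
    """One DP-SGD update (illustrative baseline, not LMO-DP).

    Each example's gradient is clipped, the clipped gradients are summed,
    Gaussian noise is added to the sum, and the noisy sum is averaged.
    """
    dim = len(params)
    summed = [0.0] * dim
    for grad in per_example_grads:
        clipped = clip_by_l2_norm(grad, max_norm)
        for i in range(dim):
            summed[i] += clipped[i]

    # The noise standard deviation scales with the clipping bound, so a
    # stronger privacy target (larger noise_multiplier) adds more noise.
    sigma = noise_multiplier * max_norm
    batch_size = len(per_example_grads)
    noisy_mean = [(s + rng.gauss(0.0, sigma)) / batch_size for s in summed]

    # Plain gradient descent step on the noisy averaged gradient.
    return [p - lr * g for p, g in zip(params, noisy_mean)]
```

Calling `dp_sgd_gaussian_step(params, grads, lr=0.1, max_norm=1.0, noise_multiplier=1.0, rng=random.Random(0))` performs one noisy update. Because the noise scale is tied to `noise_multiplier`, small privacy budgets force large perturbations of the gradient, which is the accuracy loss the abstract attributes to the Gaussian mechanism.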