Fine-tuning on task-specific datasets is a widely embraced paradigm for harnessing the powerful capabilities of pretrained LLMs on various downstream tasks. Given the popularity of LLM fine-tuning and its accompanying privacy concerns, differentially private (DP) fine-tuning of pretrained LLMs has been widely adopted to safeguard the privacy of task-specific datasets. At the design core of DP LLM fine-tuning methods lies a satisfactory tradeoff among privacy, utility, and scalability. Most existing methods build upon the seminal work of DP-SGD. Despite pushing the scalability of DP-SGD to its limit, DP-SGD-based fine-tuning methods remain limited by the inherent inefficiency of SGD. In this paper, we investigate the potential of DP zeroth-order methods for LLM fine-tuning, which avoid the scalability bottleneck of SGD by approximating the gradient with the more efficient zeroth-order estimate. Rather than treating the zeroth-order method as a drop-in replacement for SGD, this paper presents a comprehensive study, both theoretical and empirical. First, we propose the stagewise DP zeroth-order method (DP-ZOSO), which dynamically schedules key hyperparameters. This design is grounded in the synergy between DP random perturbation and the gradient approximation error of the zeroth-order method, and in their effect on the fine-tuning trajectory. We provide theoretical analysis for both proposed methods. We conduct extensive empirical analysis on both encoder-only masked language models and decoder-only autoregressive language models, achieving impressive results in terms of scalability and utility across classes of tasks (compared with DPZero, DP-ZOPO improves by $4.5\%$ on SST-5 and $5.5\%$ on MNLI with RoBERTa-Large, and by $9.2\%$ on CB and $3.9\%$ on BoolQ with OPT-2.7B when $\epsilon=4$, with more significant gains on more complicated tasks).
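To make the idea concrete, below is a minimal, illustrative sketch of one DP zeroth-order update: a shared random direction is sampled, each example's directional derivative is estimated with an SPSA-style two-point finite difference, the scalar estimates are clipped, Gaussian noise is added, and the parameters move along the sampled direction. The `dp_zo_step` helper, its hyperparameter values, and the toy quadratic loss are assumptions for illustration only; this is not the paper's DP-ZOSO/DP-ZOPO algorithm or its hyperparameter schedules.

```python
import numpy as np

def dp_zo_step(params, loss_fn, batch, lr=1e-2, mu=1e-3,
               clip=1.0, noise_mult=0.1, rng=None):
    """One illustrative DP zeroth-order update (SPSA-style sketch)."""
    rng = rng or np.random.default_rng()
    z = rng.standard_normal(params.shape)            # shared perturbation direction
    # Per-example two-point estimates of the directional derivative along z.
    g = np.array([(loss_fn(params + mu * z, x) -
                   loss_fn(params - mu * z, x)) / (2 * mu)
                  for x in batch])
    g = np.clip(g, -clip, clip)                      # bound each scalar's sensitivity
    noise = noise_mult * clip * rng.standard_normal()
    g_priv = (g.sum() + noise) / len(batch)          # noisy mean of clipped estimates
    return params - lr * g_priv * z                  # step along the shared direction

# Toy demo: each "example" is a scalar target of a quadratic loss.
loss = lambda w, x: 0.5 * np.sum((w - x) ** 2)
w = np.zeros(3)
for _ in range(200):
    w = dp_zo_step(w, loss, batch=[1.0, 1.0])
```

Note that only two loss evaluations per example are needed, with no backpropagation; this is the source of the memory and scalability advantage over first-order DP-SGD.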