Fine-tuning of large language models (LLMs) shows excellent promise. However, vanilla fine-tuning methods often require intricate data mixtures and repeated experiments to achieve optimal generalization. To address these challenges and streamline the training process, we propose an efficient and universal solution, Dynamic Boosted Annealing (DBA). We obtain a global gradient through zero-learning-rate training on general data, which is subsequently employed for gradient boosting and dynamic training step correction during domain training. In conjunction with annealing learning, we establish a fine-tuning pipeline that relies solely on domain data without collapse. Evaluated on both general and domain-specific performance across multiple tasks and several popular base models, DBA achieves an average improvement of 5.8% in joint performance over vanilla fine-tuning. Furthermore, since general data is no longer involved in annealing, the repeated experiments driven by data mixing are also eliminated. In our tests, DBA reduces GPU hours by 91.0% compared to the vanilla method.
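For illustration, the sketch below shows how such a pipeline might look in PyTorch: a zero-learning-rate pass accumulates a global gradient over general data, which is then used to boost domain gradients and to gate steps whose direction conflicts with it. The function names, the additive boosting rule, and the cosine-similarity gate are assumptions made for this sketch, not the paper's exact formulation.

```python
# Minimal sketch of a DBA-style fine-tuning loop (PyTorch).
# The boosting and step-gating rules below are illustrative assumptions.
import torch
import torch.nn.functional as F


def collect_global_gradient(model, general_loader, loss_fn):
    """Zero-learning-rate pass: accumulate gradients on general data without
    updating any weights, yielding one averaged 'global gradient' per parameter."""
    model.zero_grad(set_to_none=True)
    n_batches = 0
    for inputs, targets in general_loader:
        loss = loss_fn(model(inputs), targets)
        loss.backward()  # gradients accumulate in .grad; no optimizer step
        n_batches += 1
    return {name: p.grad.detach().clone() / max(n_batches, 1)
            for name, p in model.named_parameters() if p.grad is not None}


def dba_domain_step(model, optimizer, global_grad, batch, loss_fn,
                    boost=0.1, gate_threshold=-0.5):
    """One domain-training step with gradient boosting and a dynamic step
    correction: the update is skipped when the domain gradient points sharply
    against the global gradient (assumed gating rule)."""
    inputs, targets = batch
    optimizer.zero_grad(set_to_none=True)
    loss = loss_fn(model(inputs), targets)
    loss.backward()

    # Cosine alignment between the flattened domain and global gradients.
    dom = torch.cat([p.grad.flatten() for n, p in model.named_parameters()
                     if n in global_grad and p.grad is not None])
    glo = torch.cat([global_grad[n].flatten() for n, p in model.named_parameters()
                     if n in global_grad and p.grad is not None])
    alignment = F.cosine_similarity(dom, glo, dim=0).item()

    if alignment < gate_threshold:
        return loss.item(), False  # dynamic correction: skip this step

    # Gradient boosting: mix a fraction of the global gradient into the update.
    for name, p in model.named_parameters():
        if name in global_grad and p.grad is not None:
            p.grad.add_(global_grad[name], alpha=boost)
    optimizer.step()
    return loss.item(), True
```

In practice the annealing component would be a decaying learning-rate schedule over the domain-only training run, e.g. `torch.optim.lr_scheduler.CosineAnnealingLR`, applied after each accepted step.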