We show that variational learning can significantly improve the accuracy and calibration of Low-Rank Adaptation (LoRA) without a substantial increase in cost. We replace AdamW with the Improved Variational Online Newton (IVON) algorithm to finetune large language models. For Llama-2 with 7 billion parameters, IVON improves accuracy over AdamW by 2.8% and expected calibration error by 4.6%. The accuracy is also better than that of other Bayesian alternatives, yet the cost is lower and the implementation is simpler. Our work provides further evidence of the effectiveness of IVON for large language models. The code is available at https://github.com/team-approx-bayes/ivon-lora.