We show that variational learning can significantly improve the accuracy and calibration of Low-Rank Adaptation (LoRA) without a substantial increase in cost. We replace AdamW with the Improved Variational Online Newton (IVON) algorithm to finetune large language models. For Llama-2 with 7 billion parameters, IVON improves accuracy over AdamW by 2.8% and expected calibration error by 4.6%. The accuracy is also better than that of other Bayesian alternatives, yet the cost is lower and the implementation is easier. Our work provides additional evidence for the effectiveness of IVON for large language models. The code is available at https://github.com/team-approx-bayes/ivon-lora.