Low-Rank Adaptation (LoRA) has become the de facto parameter-efficient fine-tuning technique for large language models. We present HeteroLoRA, a lightweight search algorithm that leverages zero-cost proxies to allocate the limited LoRA trainable-parameter budget across the model for better fine-tuned performance. Beyond the allocation for standard LoRA-adapted models, we also demonstrate the efficacy of HeteroLoRA by performing the allocation in a more challenging search space that includes both LoRA modules and LoRA-adapted shortcut connections. Experiments show that HeteroLoRA improves model performance under the same parameter budget; for example, on MRPC we observe a 1.6% improvement in accuracy with a similar trainable-parameter budget. We will open-source our algorithm once the paper is accepted.
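To make the setup concrete, the following is a minimal sketch (not the paper's implementation) of a LoRA-adapted linear layer and of budget-proportional rank allocation. The proxy scores, the `allocate_ranks` helper, and all names are illustrative assumptions; the actual zero-cost proxies and search procedure are those described in the paper.

```python
import numpy as np

class LoRALinear:
    """Frozen linear layer W plus a trainable low-rank update B @ A.

    Forward: y = x @ W.T + (alpha / r) * x @ A.T @ B.T
    Only A (r x in_dim) and B (out_dim x r) are trained; W stays frozen.
    """
    def __init__(self, W, r, alpha=16, rng=None):
        rng = rng or np.random.default_rng(0)
        out_dim, in_dim = W.shape
        self.W = W                                          # frozen pretrained weight
        self.A = rng.normal(scale=0.01, size=(r, in_dim))   # trainable
        self.B = np.zeros((out_dim, r))                     # trainable, zero-init so the
                                                            # update is a no-op at start
        self.scale = alpha / r

    def __call__(self, x):
        return x @ self.W.T + self.scale * (x @ self.A.T) @ self.B.T


def allocate_ranks(proxy_scores, total_rank_budget):
    """Hypothetical heterogeneous allocation: instead of one global rank,
    split a total rank budget across modules in proportion to per-module
    proxy scores (placeholders for zero-cost proxy values)."""
    total = sum(proxy_scores.values())
    return {name: max(1, round(total_rank_budget * s / total))
            for name, s in proxy_scores.items()}
```

At initialization `B` is zero, so the adapted layer reproduces the frozen model exactly; training then moves only the `A`/`B` factors, and the allocation step decides how much rank each module receives under a fixed total budget.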