Low-Rank Adaptation (LoRA) has spurred research aimed at closing its performance gap with full fine-tuning. However, significant challenges remain: (1) simply increasing the rank of LoRA does not effectively capture high-rank information, leading to a performance bottleneck; (2) MoE-style LoRA methods substantially increase parameter counts and inference latency, contradicting the goals of efficient fine-tuning and ease of application. To address these challenges, we introduce Mixture of Ranks (MoR), which learns rank-specific information for different tasks based on the input and efficiently integrates multi-rank information. We first propose a framework that equates the integration of multiple LoRAs with expanding the rank of a single LoRA. Moreover, we hypothesize that a low-rank LoRA already captures sufficient intrinsic information, and that MoR can derive high-rank information through mathematical transformations of the low-rank components. MoR thus reduces the learning difficulty of LoRA and enhances its multi-task capabilities. MoR achieves impressive results, delivering a 1.31\% performance improvement while using only 93.93\% of the parameters of baseline methods.
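As a minimal sketch of the rank-expansion equivalence (in our own illustrative notation; the paper's formal construction may differ, e.g., input-dependent gating weights can be folded into the factors): summing $k$ rank-$r$ LoRA branches is algebraically identical to a single LoRA update of rank at most $kr$,
\[
\Delta W \;=\; \sum_{i=1}^{k} B_i A_i \;=\; \underbrace{\begin{bmatrix} B_1 & \cdots & B_k \end{bmatrix}}_{B' \,\in\, \mathbb{R}^{d \times kr}} \, \underbrace{\begin{bmatrix} A_1 \\ \vdots \\ A_k \end{bmatrix}}_{A' \,\in\, \mathbb{R}^{kr \times d'}},
\]
where each $B_i \in \mathbb{R}^{d \times r}$ and $A_i \in \mathbb{R}^{r \times d'}$, so $\operatorname{rank}(\Delta W) \le kr$.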