In this paper, we introduce a subspace-inspired Low-Rank Adaptation (LoRA) method that is computationally efficient, easy to implement, and readily applicable to large language, multimodal, and diffusion models. We first decompose the LoRA weights into two subspaces in an equivalent form and find that simply mixing them can enhance performance. To study this phenomenon, we revisit it through a fine-grained subspace lens and show that the modification is equivalent to fusing the subspaces with a fixed mixer. For greater flexibility, we learn the mixer jointly with the original LoRA weights and term the method Mixture-of-Subspaces LoRA (MoSLoRA). MoSLoRA consistently outperforms LoRA on tasks in different modalities, including commonsense reasoning, visual instruction tuning, and subject-driven text-to-image generation, demonstrating its effectiveness and robustness. Code is available at https://github.com/wutaiqiang/MoSLoRA.
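The mixer idea can be illustrated with a minimal sketch. It assumes the mixer is an r x r matrix `M` placed between the two low-rank factors, so the update becomes `B @ M @ A` instead of `B @ A`. The abstract does not spell out this form, and the helper names (`lora_delta`, `mos_lora_delta`), shapes, and random values are hypothetical and not taken from the released code. Under that assumption, a fixed identity mixer reproduces the vanilla LoRA update, and a learnable mixer adds only r * r parameters.

```python
import random


def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def identity(n):
    """Return the n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def random_matrix(rows, cols, rng):
    """Return a rows x cols matrix with entries drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def lora_delta(B, A):
    """Vanilla LoRA update: B (d_out x r) times A (r x d_in)."""
    return matmul(B, A)


def mos_lora_delta(B, M, A):
    """Assumed mixer form: B (d_out x r) times M (r x r) times A (r x d_in)."""
    return matmul(matmul(B, M), A)


rng = random.Random(0)
d_out, d_in, r = 4, 5, 2  # toy sizes chosen for illustration only
B = random_matrix(d_out, r, rng)
A = random_matrix(r, d_in, rng)

# A fixed identity mixer should give the same update as vanilla LoRA.
delta_lora = lora_delta(B, A)
delta_identity = mos_lora_delta(B, identity(r), A)
max_gap = max(
    abs(u - v)
    for row_u, row_v in zip(delta_lora, delta_identity)
    for u, v in zip(row_u, row_v)
)

# A learnable mixer would start from some r x r matrix and be trained with A and B.
M = random_matrix(r, r, rng)
delta_mixed = mos_lora_delta(B, M, A)

lora_params = r * (d_in + d_out)  # parameters in A and B
extra_params = r * r              # additional parameters in the mixer
```

In this toy setting the mixer adds `extra_params = 4` values on top of `lora_params = 18`. Since r is much smaller than the layer dimensions in practice, the overhead of a mixer of this shape stays small.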