As the number of model parameters grows, parameter-efficient fine-tuning (PEFT) has become the standard approach for adapting pre-trained large language models. Low-Rank Adaptation (LoRA) approximates full-parameter fine-tuning with a low-rank update and is widely used to reduce resource requirements. However, lowering the rank limits representational capacity relative to full-parameter fine-tuning. We present \textbf{SMoA}, a high-rank \textbf{S}tructured \textbf{Mo}dulation \textbf{A}dapter that uses fewer trainable parameters while maintaining a higher rank, thereby improving the model's representational capacity and performance potential. The core idea is to freeze the original pretrained weights and selectively amplify or suppress their important features across multiple subspaces; this subspace mechanism offers an efficient way to increase model capacity and expressiveness. We conduct both theoretical analyses and empirical studies on a variety of tasks. Experimental results show that SMoA outperforms LoRA and its variants on 10 tasks, and extensive ablation studies validate its effectiveness.
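To make the core idea concrete, the following is a minimal sketch, not the paper's actual method: the frozen weight matrix is partitioned into subspaces, and each subspace is rescaled by a small set of trainable modulation factors. All names (`smoa_forward`, `n_sub`, `scales`) and the column-wise partition are illustrative assumptions; the point is that elementwise rescaling of a full-rank weight preserves high rank while training only a handful of parameters.

```python
import numpy as np

def smoa_forward(W, x, scales, n_sub=4):
    """Illustrative sketch (not the paper's exact formulation):
    the frozen weight W (d_out x d_in) is split column-wise into
    n_sub subspaces; each subspace is multiplied by one trainable
    factor, amplifying or suppressing its features. Unlike LoRA's
    additive low-rank update, the modulated weight stays full rank."""
    d_out, d_in = W.shape
    blocks = np.split(np.arange(d_in), n_sub)   # column indices per subspace
    W_mod = W.copy()
    for s, idx in zip(scales, blocks):
        W_mod[:, idx] *= s                      # per-subspace modulation
    return W_mod @ x

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))                 # frozen pretrained weight
x = rng.standard_normal(8)
scales = np.array([1.2, 0.8, 1.0, 0.5])         # trainable: only n_sub scalars
y = smoa_forward(W, x, scales)
```

With all factors equal to 1 the layer reduces exactly to the frozen pretrained mapping, so the adapter can start from an identity-like initialization; the trainable parameter count scales with the number of subspaces rather than with the weight dimensions, in contrast to LoRA's $d_{\text{out}} \cdot r + r \cdot d_{\text{in}}$.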