Parameter-efficient fine-tuning (PEFT) methods have become the standard paradigm for adapting large-scale models. Among these techniques, Weight-Decomposed Low-Rank Adaptation (DoRA) has been shown to improve both the learning capacity and training stability of Low-Rank Adaptation (LoRA) by explicitly decomposing pre-trained weights into magnitude and directional components. In this work, we propose DoRAN, a new technique designed to stabilize training and boost the sample efficiency of DoRA. Our framework introduces two key components: (i) the injection of learnable noise into the denominator of DoRA's weight decomposition, which serves as an adaptive regularizer that mitigates instabilities and improves the estimation rate of the low-rank matrices; and (ii) the replacement of static low-rank matrices with auxiliary networks that generate them dynamically, which enables parameter coupling between the query and value projection matrices and improves sample efficiency both theoretically and empirically. Comprehensive experiments on vision and language benchmarks show that DoRAN consistently outperforms LoRA, DoRA, and other PEFT baselines, underscoring the effectiveness of combining noise-based regularization with network-based parameter generation.
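To make component (i) concrete, the sketch below writes out the standard DoRA reparameterization, W' = m · (W₀ + BA) / ‖W₀ + BA‖_c, and then adds a learnable noise term to the column-norm denominator as the abstract describes. The variable names (`eps` for the noise, its shape and initialization) are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 6, 2

W0 = rng.standard_normal((d_out, d_in))   # frozen pre-trained weight

# DoRA decomposition: magnitude m (per-column norm of W0) and
# direction V = W0 + B @ A, with B initialized to zero as in LoRA.
m = np.linalg.norm(W0, axis=0, keepdims=True)   # shape (1, d_in)
B = np.zeros((d_out, r))
A = 0.01 * rng.standard_normal((r, d_in))
V = W0 + B @ A

# DoRAN sketch: a learnable noise parameter added to the denominator,
# acting as an adaptive regularizer on the direction normalization.
# `eps` is a hypothetical placeholder; the paper treats it as trainable.
eps = np.full((1, d_in), 1e-2)
W_adapted = m * V / (np.linalg.norm(V, axis=0, keepdims=True) + eps)
```

With `B = 0` the direction equals `W0` and its column norms equal `m`, so the adapted weight reduces to a mild column-wise shrink of `W0`; during training, `m`, `A`, `B`, and `eps` would all receive gradients while `W0` stays frozen.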