As model sizes continue to grow, parameter-efficient fine-tuning has emerged as a powerful alternative to full fine-tuning. While LoRA is widely adopted among these methods, recent research has explored vector-based adaptation methods due to their extreme parameter efficiency. However, these methods typically require substantially higher ranks than LoRA to match its performance, leading to increased training costs. This work introduces GiVA, a gradient-based initialization strategy for vector-based adaptation. It achieves training times comparable to LoRA and maintains the extreme parameter efficiency of vector-based adaptation. We evaluate GiVA across diverse benchmarks, including natural language understanding, natural language generation, and image classification. Experiments show that our approach consistently outperforms or achieves performance competitive with existing vector-based adaptation methods and LoRA while reducing rank requirements by a factor of eight ($8\times$).