Fine-tuning large language models (LLMs) is computationally intensive because it requires updating all parameters. Low-Rank Adaptation (LoRA) improves efficiency by modifying only a subset of weights but introduces a trade-off between expressivity and computational cost: lower ranks reduce resources but limit expressiveness, while higher ranks enhance expressivity at increased cost. Despite recent advances in adaptive LoRA techniques, existing methods fail to provide a theoretical basis for optimizing the trade-off between model performance and efficiency. We propose Geometric Low-Rank Adaptation (GeLoRA), a novel framework that computes the intrinsic dimensionality of hidden state representations to adaptively select LoRA ranks. We demonstrate that the intrinsic dimension provides a lower bound for the optimal rank of LoRA matrices, allowing for a principled selection that balances efficiency and expressivity. GeLoRA dynamically adjusts the rank for each layer based on the intrinsic dimensionality of its input and output representations, recognizing that not all model parameters equally impact fine-tuning. Empirical validation on multiple tasks shows that GeLoRA consistently outperforms recent baselines within the same parameter budget.
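The rank-selection idea above can be illustrated with a minimal sketch. The snippet below is not the paper's implementation: it estimates intrinsic dimension with the TwoNN estimator (the ratio of distances to each point's two nearest neighbors), which is one common choice of ID estimator, and then applies a simple hypothetical rule — rank equals the ceiling of the larger of the input and output IDs — consistent with the stated lower-bound result. The function names and the exact rank rule are illustrative assumptions.

```python
import numpy as np

def two_nn_id(X):
    """Estimate the intrinsic dimension of a point cloud X (n, D)
    with the TwoNN estimator: for each point, take the ratio
    mu = r2 / r1 of distances to its two nearest neighbors;
    the MLE of the dimension is n / sum(log(mu))."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared distances
    np.fill_diagonal(d2, np.inf)
    d2 = np.maximum(d2, 0.0)  # guard against small negative round-off
    r = np.sqrt(np.sort(d2, axis=1)[:, :2])  # two nearest-neighbor distances
    mu = r[:, 1] / r[:, 0]
    mu = mu[np.isfinite(mu) & (mu > 1.0)]  # drop duplicates / degenerate pairs
    return len(mu) / np.sum(np.log(mu))

def select_rank(hidden_in, hidden_out):
    """Hypothetical per-layer rank rule: use the intrinsic dimensions of
    the layer's input and output representations as a lower bound and
    round the larger one up to an integer rank."""
    return int(np.ceil(max(two_nn_id(hidden_in), two_nn_id(hidden_out))))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic check: points on a 2-D plane embedded in 10-D ambient space
    # should yield an intrinsic dimension near 2 regardless of ambient size.
    latent = rng.uniform(size=(1000, 2))
    embed = rng.normal(size=(2, 10))
    X = latent @ embed
    print(f"estimated ID: {two_nn_id(X):.2f}")
```

Under this heuristic, a layer whose hidden states lie on a low-dimensional manifold receives a small rank, while layers with higher-dimensional representations keep more capacity, which is the efficiency/expressivity balance the abstract describes.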