Low-Rank Adaptation (LoRA) is widely used to efficiently adapt Transformers by adding trainable low-rank matrices to the attention projections. While effective, these matrices are learned independently for each attention projection (Query, Key, and Value) and each layer. Recent extensions have considered joint, tensor-based adaptations, but only in limited forms and without a systematic framework. We introduce TensLoRA, a unified framework that aggregates LoRA updates into higher-order tensors and models a broad family of tensor-based low-rank adaptations. Our formulation generalizes existing tensor-based methods and enables mode-specific compression rates, allowing parameter budgets to be tailored to the modality and task. Experiments on vision and language benchmarks reveal that the tensor construction directly impacts performance, sometimes outperforming standard LoRA at similar parameter counts.
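To make the parameter-budget argument concrete, the sketch below compares the trainable-parameter count of standard per-matrix LoRA against a Tucker-style joint factorization of the stacked updates. The shapes and the mode-rank names (`r_out`, `r_in`, `r_proj`, `r_layer`) are illustrative assumptions, not the paper's exact notation or method.

```python
# Illustrative parameter-count comparison: independent per-matrix LoRA vs. a
# Tucker-style factorization of the updates stacked into one 4th-order tensor.
# All shapes and rank choices below are assumptions for illustration only.

d, L, n_proj = 768, 12, 3      # hidden size, number of layers, projections (Q, K, V)
r = 8                          # standard LoRA rank

# Standard LoRA: an independent pair (B in R^{d x r}, A in R^{r x d})
# for every projection of every layer.
lora_params = n_proj * L * 2 * d * r

# Tensor view: stack all updates into a (d, d, n_proj, L) tensor and factor it
# with a mode-specific rank per tensor mode (one factor matrix per mode plus a
# small core tensor), so each mode's compression rate can be set separately.
r_out, r_in, r_proj, r_layer = 8, 8, 2, 4
tucker_params = (d * r_out + d * r_in + n_proj * r_proj + L * r_layer
                 + r_out * r_in * r_proj * r_layer)

print(f"per-matrix LoRA parameters:   {lora_params}")
print(f"joint Tucker-style parameters: {tucker_params}")
```

Because the projection and layer modes share factors across the whole model, the joint factorization can sit well below the per-matrix budget, and raising or lowering an individual mode rank trades capacity in just that mode.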