While following different technical routes, both low-rank and orthogonal adaptation techniques can efficiently adapt large-scale pre-trained models to specific tasks or domains using a small set of trainable parameters. In this study, we bridge the gap between these two techniques, proposing a simple but effective adaptation method based on Householder reflections. Given a pre-trained model, our method fine-tunes its layers by multiplying each frozen weight matrix by an orthogonal matrix constructed from a chain of learnable Householder reflections (HRs). This HR-based orthogonal fine-tuning is equivalent to an adaptive low-rank adaptation. Moreover, we show that the orthogonality of the reflection planes corresponding to the HRs impacts model capacity and regularity. This analysis motivates us to regularize the orthogonality of the HRs, leading to different implementations of the proposed Householder reflection adaptation (HRA) method. Compared with state-of-the-art methods, HRA achieves superior performance with fewer learnable parameters when adapting large language models and conditional image generators. The code for the experiments is available at \url{https://github.com/DaShenZi721/HRA}, and the method has been merged into the \href{https://github.com/huggingface/peft}{PEFT} package.
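As a minimal sketch (using NumPy, with hypothetical names; not the authors' implementation), the core idea can be illustrated by multiplying a frozen weight matrix by a product of learnable Householder reflections. The product is orthogonal by construction, and the induced weight update has rank at most the number of reflections, which is the stated connection to low-rank adaptation:

```python
import numpy as np

def householder_chain(vs):
    """Build the orthogonal matrix H = H_1 H_2 ... H_r from reflection
    vectors vs, where H_i = I - 2 u_i u_i^T and u_i = v_i / ||v_i||."""
    d = vs.shape[1]
    H = np.eye(d)
    for v in vs:
        u = v / np.linalg.norm(v)
        H = H @ (np.eye(d) - 2.0 * np.outer(u, u))
    return H

# Toy example: adapt a frozen 4x4 weight matrix with r = 2 reflections.
rng = np.random.default_rng(0)
W_frozen = rng.standard_normal((4, 4))
vs = rng.standard_normal((2, 4))   # learnable reflection vectors (r = 2)

H = householder_chain(vs)
W_adapted = W_frozen @ H           # fine-tuned weight

# H is orthogonal: H^T H = I.
assert np.allclose(H.T @ H, np.eye(4))

# The update W_adapted - W_frozen = W_frozen (H - I) has rank <= r,
# since each factor H_i - I has rank 1; this links orthogonal
# fine-tuning to an adaptive low-rank adaptation.
assert np.linalg.matrix_rank(W_adapted - W_frozen) <= 2
```

In practice the reflection vectors would be trained by gradient descent while `W_frozen` stays fixed; the sketch above only checks the two structural properties the abstract relies on.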