Parameter-efficient finetuning (PEFT) has become ubiquitous for adapting foundation models to downstream task requirements while retaining their generalization ability. However, the number of additional parameters and the compute needed for successful adaptation and hyperparameter search can grow quickly, especially when models are deployed at scale to serve numerous individual requests. To ensure effective, parameter-efficient, and hyperparameter-robust adaptation, we propose the ETHER transformation family, which performs Efficient fineTuning via HypErplane Reflections. By design, ETHER transformations require a minimal number of parameters, are less likely to deteriorate model performance, and are robust to hyperparameter choices, including the learning rate. In particular, we introduce ETHER and its relaxation ETHER+, which match or outperform existing PEFT methods with significantly fewer parameters ($\sim 10$-$100$ times fewer than LoRA or OFT) across multiple image synthesis and natural language tasks without exhaustive hyperparameter tuning. Finally, we investigate the recent emphasis on Hyperspherical Energy retention for adaptation and raise questions about its practical utility. The code is available at https://github.com/mwbini/ether.
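To make the notion of a hyperplane reflection concrete, the following is a minimal NumPy sketch. It assumes the standard Householder form $H = I - 2uu^\top$ with a unit vector $u$; the abstract does not spell out the parameterization, so this should be read as an illustration rather than the authors' implementation. The function names `householder` and `reflect_weights`, the layer shape, and the random weights are all hypothetical.

```python
import numpy as np

def householder(u):
    """Return H = I - 2 u u^T for the unit-normalized vector u (assumed form)."""
    u = u / np.linalg.norm(u)
    return np.eye(u.size) - 2.0 * np.outer(u, u)

def reflect_weights(W, u):
    """Compute H @ W without forming H, using H W = W - 2 u (u^T W)."""
    u = u / np.linalg.norm(u)
    return W - 2.0 * np.outer(u, u @ W)

# Hypothetical frozen weight matrix of shape (d, k) and a trainable vector u.
rng = np.random.default_rng(0)
d, k = 8, 5
W = rng.standard_normal((d, k))
u = rng.standard_normal(d)

H = householder(u)
W_adapted = reflect_weights(W, u)

# A reflection is orthogonal, so it preserves the norm of the weights,
# and it is parameterized by only d values instead of d * d.
num_trainable = u.size
```

Under this assumption the transformation is orthogonal and has a fixed distance from the identity regardless of $u$, which gives one intuition for why such an update is bounded, while the trainable state is a single length-$d$ vector per transformed matrix.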