PEFT-Ref: A Modular Reference Architecture and Typology for Parameter-Efficient Finetuning Techniques

Recent parameter-efficient finetuning (PEFT) techniques aim to reduce the considerable cost of fully finetuning large pretrained language models (PLMs). As different PEFT techniques proliferate, it is becoming difficult to compare them, in particular in terms of (i) the structure and functionality they add to the PLM, (ii) the different types and degrees of efficiency improvements achieved, (iii) performance on different downstream tasks, and (iv) how differences in structure and functionality relate to efficiency and task performance. To facilitate such comparisons, this paper presents a reference architecture which standardises aspects shared by different PEFT techniques, while isolating differences to specific locations and interactions with the standard components. Through this process of standardising and isolating differences, a modular view of PEFT techniques emerges, supporting not only direct comparison of different techniques and their efficiency and task performance, but also systematic exploration of the reusability and composability of the different types of finetuned modules. We demonstrate how the reference architecture can be applied to understand the properties and relative advantages of PEFT techniques, and hence to inform both the selection of techniques for specific tasks and design choices for new PEFT techniques.

