Parameter-efficient transfer learning (PETL) based on large-scale pre-trained foundation models has achieved great success in various downstream applications. Existing tuning methods, such as prompt tuning, prefix tuning, and adapters, make task-specific lightweight adjustments to different parts of the original architecture. However, each takes effect on only part of the pre-trained model, i.e., only the feed-forward layers or only the self-attention layers, which leaves the remaining frozen structures unable to adapt to the data distributions of downstream tasks. Further, the existing structures are strongly coupled with Transformers, which hinders parameter-efficient deployment and limits design flexibility for new approaches. In this paper, we revisit the design paradigm of PETL and derive a unified framework, U-Tuning, which is composed of an operation with frozen parameters and a unified tuner that adapts the operation for downstream applications. The U-Tuning framework encompasses existing methods and also yields new approaches for parameter-efficient transfer learning, which achieve on-par or better performance on the CIFAR-100 and FGVC datasets compared with existing PETL methods.
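To make the "frozen operation plus unified tuner" decomposition concrete, the following is a minimal, dependency-free sketch. Everything in it is an assumption made for illustration rather than the paper's implementation: the abstract does not say how the two branches are combined, so the sketch assumes their outputs are summed; the tuner is assumed to be a small bottleneck (down-projection, ReLU, up-projection); and the names `FrozenLinear`, `BottleneckTuner`, and `UTuned` are hypothetical.

```python
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


class FrozenLinear:
    """Hypothetical stand-in for a pre-trained operation (for example an
    attention or feed-forward block). Its weights are never updated."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        self.W = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]

    def __call__(self, x):
        return matvec(self.W, x)


class BottleneckTuner:
    """Assumed tuner design: down-project, ReLU, up-project.
    These are the only weights that would be trained."""

    def __init__(self, dim, hidden, seed=1):
        rng = random.Random(seed)
        self.down = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(hidden)]
        # Zero-initialised up-projection: the tuner contributes nothing at the
        # start, so the wrapped block initially behaves like the frozen one.
        self.up = [[0.0] * hidden for _ in range(dim)]

    def __call__(self, x):
        h = [max(0.0, v) for v in matvec(self.down, x)]
        return matvec(self.up, h)

    def num_params(self):
        return len(self.down) * len(self.down[0]) + len(self.up) * len(self.up[0])


class UTuned:
    """Wraps a frozen operation with a parallel tuner.
    Assumption: the two outputs are combined by element-wise addition."""

    def __init__(self, op, tuner):
        self.op = op
        self.tuner = tuner

    def __call__(self, x):
        return [a + b for a, b in zip(self.op(x), self.tuner(x))]


if __name__ == "__main__":
    dim, hidden = 8, 2
    op = FrozenLinear(dim)
    tuner = BottleneckTuner(dim, hidden)
    block = UTuned(op, tuner)
    x = [0.5] * dim
    print("matches frozen op at init:", block(x) == op(x))
    print("trainable params:", tuner.num_params(), "frozen params:", dim * dim)
```

Because the wrapper only requires the operation to be callable, the same tuner could in principle be attached to an attention layer, a feed-forward layer, or a whole block, which is the kind of decoupling from a specific Transformer component that the abstract argues for.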