We propose Conditional Adapter (CoDA), a parameter-efficient transfer learning method that also improves inference efficiency. CoDA generalizes beyond standard adapter approaches to enable a new way of balancing speed and accuracy using conditional computation. Starting with an existing dense pretrained model, CoDA adds sparse activation together with a small number of new parameters and a lightweight training phase. Our experiments demonstrate that the CoDA approach provides an unexpectedly efficient way to transfer knowledge. Across a variety of language, vision, and speech tasks, CoDA achieves a 2x to 8x inference speed-up compared to state-of-the-art Adapter approaches, with moderate to no accuracy loss and the same parameter efficiency.
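The abstract does not spell out the mechanism, so the following is only a minimal sketch of how conditional computation might be combined with an adapter. It assumes a per-token router score and hard top-k selection: every token receives a cheap adapter update, while only the `k` highest-scoring tokens pass through the expensive frozen pretrained layer. The names `conditional_layer`, `frozen_layer`, `adapter`, `scores`, and `k` are hypothetical and are not taken from the CoDA implementation.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def conditional_layer(
    tokens: Sequence[Vector],
    scores: Sequence[float],
    k: int,
    frozen_layer: Callable[[Vector], Vector],
    adapter: Callable[[Vector], Vector],
) -> List[Vector]:
    """Apply the cheap adapter to every token, and the expensive frozen
    layer only to the k tokens with the highest router scores."""
    if len(tokens) != len(scores):
        raise ValueError("need one score per token")
    k = max(0, min(k, len(tokens)))
    # Hard top-k selection (an assumption of this sketch).
    order = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    selected = set(order[:k])

    outputs = []
    for i, x in enumerate(tokens):
        # Residual adapter update, applied to all tokens.
        y = [xi + ai for xi, ai in zip(x, adapter(x))]
        if i in selected:
            # Only selected tokens pay for the frozen pretrained computation.
            y = [yi + fi for yi, fi in zip(y, frozen_layer(x))]
        outputs.append(y)
    return outputs


# Toy usage: 4 tokens, of which only 2 reach the frozen layer.
calls = []


def frozen(x: Vector) -> Vector:
    calls.append(x)  # record how often the expensive path runs
    return [2.0 * v for v in x]


def small_adapter(x: Vector) -> Vector:
    return [0.1 * v for v in x]


tokens = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0], [0.5, 0.5]]
scores = [0.1, 0.9, 0.8, 0.2]
out = conditional_layer(tokens, scores, 2, frozen, small_adapter)
```

In this toy setting the frozen layer runs on 2 of 4 tokens. If the frozen layer dominates the cost, processing `k` of `n` tokens reduces that cost by roughly a factor of `n / k`, which is the kind of speed and accuracy trade-off the abstract describes.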