Large Language Models (LLMs) can solve a variety of tasks out of the box, such as text summarization and mathematical question answering, but they are often trained with a single task in mind. Because of high computational costs, the current trend is to use prompt instruction tuning to adapt monolithic, pretrained LLMs to new, but often individual, downstream tasks. How to extend prompt tuning to handle heterogeneous tasks and data distributions simultaneously therefore remains a largely open question. To address this gap, we propose \emph{Mixture of Prompts} (MoPs) paired with a smart gating function: the gating function, whose design is one of the contributions of this paper, identifies relevant skills embedded in different groups of prompts and dynamically assigns combined experts (i.e., collections of prompts) based on the target task. Additionally, MoPs are empirically agnostic to any model compression technique applied for efficiency reasons, as well as to the instruction data source and task composition. In practice, MoPs simultaneously mitigate prompt training "interference" in multi-task, multi-source scenarios (e.g., task and data heterogeneity across sources), as well as the possible effects of model approximations. As a highlight, MoPs reduce final perplexity by $\sim20\%$ to $\sim70\%$ relative to baselines in the federated scenario, and by $\sim3\%$ to $\sim30\%$ in the centralized scenario.
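To make the gating idea concrete, the following is a minimal sketch of how a gate might weight several prompt experts for a given task. It is not the gating design contributed by the paper, which the abstract does not specify: the dot-product scoring, the softmax normalization, the weighted-sum mixing, and every name used here (`gate_weights`, `mix_prompts`, `expert_keys`, `expert_prompts`, `task_embedding`) are assumptions chosen purely for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gate_weights(task_embedding, expert_keys):
    """Score each prompt expert against the task (dot product), then normalize.

    Assumption: each expert has a key vector of the same size as the
    task embedding. This is an illustrative gate, not the paper's.
    """
    scores = [sum(t * k for t, k in zip(task_embedding, key)) for key in expert_keys]
    return softmax(scores)


def mix_prompts(weights, expert_prompts):
    """Combine experts by a weighted sum of their prompt-token vectors.

    Each expert is a list of prompt tokens; each token is a list of floats.
    """
    n_tokens = len(expert_prompts[0])
    dim = len(expert_prompts[0][0])
    mixed = [[0.0] * dim for _ in range(n_tokens)]
    for w, prompt in zip(weights, expert_prompts):
        for i in range(n_tokens):
            for j in range(dim):
                mixed[i][j] += w * prompt[i][j]
    return mixed


# Toy setup with hypothetical sizes: 3 prompt experts, 2 prompt tokens, 4 dims.
rng = random.Random(0)
DIM, N_EXPERTS, N_TOKENS = 4, 3, 2
expert_keys = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(N_EXPERTS)]
expert_prompts = [
    [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(N_TOKENS)]
    for _ in range(N_EXPERTS)
]
task_embedding = [rng.gauss(0, 1) for _ in range(DIM)]

weights = gate_weights(task_embedding, expert_keys)
mixed_prompt = mix_prompts(weights, expert_prompts)  # would be prepended to the LLM input
```

In this sketch the gate produces a soft combination of all experts. A sparse variant that keeps only the top-scoring experts would be an equally plausible reading of "dynamically assign combined experts".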