Large Language Models (LLMs) can solve a variety of tasks, such as text summarization and mathematical questions, out of the box, but they are often trained with a single task in mind. Due to high computational costs, the current trend is to use prompt instruction tuning to better adapt monolithic, pretrained LLMs to new, but often individual, downstream tasks. How to extend prompt tuning to handle heterogeneous tasks and data distributions simultaneously therefore remains a wide open question. To address this gap, we propose \emph{Mixture of Prompts} (MoPs), paired with a smart gating function: the latter, whose design is one of the contributions of this paper, identifies relevant skills embedded in different groups of prompts and dynamically assigns combined experts (i.e., collections of prompts) based on the target task. Additionally, MoPs are empirically agnostic to any model compression technique applied for efficiency reasons, as well as to the instruction data source and task composition. In practice, MoPs can simultaneously mitigate prompt training "interference" in multi-task, multi-source scenarios (e.g., task and data heterogeneity across sources), as well as possible implications of model approximations. As a highlight, MoPs reduce final perplexity, compared to baselines, by $\sim20\%$ to $\sim70\%$ in the federated scenario and by $\sim3\%$ to $\sim30\%$ in the centralized scenario.