Large language models (LLMs) effectively generate fluent text when the target output follows natural language patterns. However, structured prediction tasks confine the output format to a limited ontology, causing even very large models to struggle since they were never trained with such restrictions in mind. The difficulty of using LLMs for direct prediction is exacerbated in few-shot learning scenarios, which commonly arise due to domain shift and resource limitations. We flip the problem on its head by leveraging the LLM as a tool for data augmentation rather than direct prediction. Our proposed Mixture of Soft Prompts (MSP) serves as a parameter-efficient procedure for generating data in a controlled manner. Denoising mechanisms are further applied to improve the quality of the synthesized data. Automatic metrics show that our method is capable of producing diverse and natural text while preserving label semantics. Moreover, MSP achieves state-of-the-art results on three benchmarks when compared against strong baselines. Our method offers an alternative, data-centric approach for applying LLMs to complex prediction tasks.