Large language models (LLMs) generate fluent text effectively when the target output follows natural language patterns. However, structured prediction tasks confine the output format to a limited ontology, causing even very large models to struggle, since they were never trained with such restrictions in mind. The difficulty of using LLMs for direct prediction is exacerbated in few-shot learning scenarios, which commonly arise due to domain shift and resource limitations. We flip the problem on its head by leveraging the LLM as a tool for data augmentation rather than direct prediction. Our proposed Mixture of Soft Prompts (MSP) serves as a parameter-efficient procedure for generating data in a controlled manner. Denoising mechanisms are further applied to improve the quality of the synthesized data. Automatic metrics show that our method is capable of producing diverse and natural text while preserving label semantics. Moreover, MSP achieves state-of-the-art results on three benchmarks when compared against strong baselines. Our method offers an alternative data-centric approach for applying LLMs to complex prediction tasks.