Prompt quality plays a critical role in the performance of large language models (LLMs), motivating a growing body of work on prompt optimization. Most existing methods optimize prompts over a fixed dataset, assuming static input distributions and offering limited support for iterative improvement. We introduce SIPDO (Self-Improving Prompts through Data-Augmented Optimization), a closed-loop framework for prompt learning that integrates synthetic data generation into the optimization process. SIPDO couples a synthetic data generator with a prompt optimizer, where the generator produces new examples that reveal current prompt weaknesses and the optimizer incrementally refines the prompt in response. This feedback-driven loop enables systematic improvement of prompt performance without assuming access to external supervision or new tasks. Experiments across question answering and reasoning benchmarks show that SIPDO outperforms standard prompt tuning methods, highlighting the value of integrating data synthesis into prompt learning workflows.
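The closed loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the helper names (`generate_examples`, `evaluate`, `refine_prompt`) are hypothetical stand-ins for LLM-backed components, and the scoring logic is a placeholder.

```python
# Minimal sketch of a SIPDO-style closed loop. All helpers below are
# hypothetical placeholders for LLM-backed components described in the abstract.

def generate_examples(prompt, n=4):
    # Stand-in for the synthetic data generator, which in SIPDO produces
    # new examples designed to expose weaknesses of the current prompt.
    return [f"synthetic case {i} probing '{prompt[:20]}'" for i in range(n)]

def evaluate(prompt, examples):
    # Placeholder scorer: a real system would run the LLM with the prompt
    # on each example and measure task accuracy. Here, even-indexed
    # examples are treated as failures for illustration only.
    failures = [ex for i, ex in enumerate(examples) if i % 2 == 0]
    score = 1.0 - len(failures) / max(len(examples), 1)
    return score, failures

def refine_prompt(prompt, failures):
    # Placeholder optimizer: a real system would ask an LLM to revise the
    # prompt in light of the failing examples.
    return prompt + f" [revised against {len(failures)} failures]"

def sipdo_loop(prompt, rounds=3):
    # Feedback-driven loop: generate probing examples, find failures,
    # incrementally refine the prompt; stop early if no weaknesses remain.
    for _ in range(rounds):
        examples = generate_examples(prompt)
        score, failures = evaluate(prompt, examples)
        if not failures:
            break
        prompt = refine_prompt(prompt, failures)
    return prompt

print(sipdo_loop("Answer the question step by step."))
```

The key design point the abstract emphasizes is that no external supervision or new tasks are required: the generator itself supplies the signal that drives each refinement step.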