DRO-InstructZero: Distributionally Robust Prompt Optimization for Large Language Models

Large language models are highly sensitive to prompt wording. Popular automatic prompt search methods, including InstructZero, optimize expected performance under a single evaluation distribution, so they often degrade under distribution shift and adversarial evaluation, and prompts that work in one setting frequently fail to transfer to another. DRO-InstructZero addresses this by formulating zero-shot prompt optimization as robust Bayesian optimization. An f-divergence ball defines an ambiguity set around the evaluation distribution, and a robust acquisition rule maximizes worst-case expected utility over that set while retaining the query efficiency of Bayesian search. The search therefore targets reliability under distribution shift rather than average behavior alone. Experiments follow the instruction-induction protocol with matched query budgets across formality rewriting, code debugging, and translation. On BIG-Bench informal-to-formal rewriting, accuracy improves from 61.3 +/- 0.7% to approximately 85-90%, an absolute gain of about 25-30 points. Auto-debugging gains about 25 points under domain shift. Stable tasks such as cause-and-effect remain above 96%, indicating no loss on in-distribution cases. The improvements are consistent across divergence choices and decoding temperatures. DRO-InstructZero thus connects distributionally robust optimization with prompt learning, offering a general, plug-and-play approach to reliable, transferable prompt alignment under real-world uncertainty.
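
To make the worst-case objective concrete, the following is a minimal sketch of scoring a prompt by its worst-case expected utility over an f-divergence ball. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which divergence is used, so the sketch assumes total variation (the f-divergence with f(t) = |t - 1| / 2) restricted to the support of the nominal distribution, where the worst case has a simple greedy solution. The function names, the per-slice scores, and the radius are all hypothetical, and the Gaussian-process surrogate that a Bayesian optimizer would use to propose candidates is omitted.

```python
import numpy as np


def worst_case_expected_utility(utilities, probs, radius):
    """Minimum of E_Q[u] over Q with TV(Q, P) <= radius, Q on P's support.

    Assumption: total variation is used as the f-divergence. The minimizer
    moves up to `radius` probability mass from the highest-utility points
    onto the single lowest-utility point.
    """
    u = np.asarray(utilities, dtype=float)
    q = np.asarray(probs, dtype=float).copy()
    worst = int(np.argmin(u))
    remaining = float(radius)
    for i in np.argsort(-u):  # visit points from highest utility down
        if remaining <= 0.0:
            break
        if i == worst:
            continue
        moved = min(q[i], remaining)
        q[i] -= moved
        q[worst] += moved
        remaining -= moved
    return float(q @ u)


def select_prompt(scores, probs, radius):
    """Return the candidate whose worst-case expected utility is largest."""
    return max(
        scores,
        key=lambda name: worst_case_expected_utility(scores[name], probs, radius),
    )


# Hypothetical per-slice accuracies for two candidate prompts, evaluated on
# three slices of evaluation data with nominal weights `probs`.
scores = {
    "prompt_a": [0.90, 0.90, 0.30],  # strong on average, brittle on slice 3
    "prompt_b": [0.70, 0.70, 0.65],  # weaker on average, stable everywhere
}
probs = [0.4, 0.4, 0.2]

average_choice = select_prompt(scores, probs, radius=0.0)  # plain expectation
robust_choice = select_prompt(scores, probs, radius=0.2)   # worst case in ball
print(average_choice, robust_choice)
```

With a radius of zero the rule reduces to ordinary expected utility and prefers the candidate with the higher average; with a positive radius it prefers the candidate that holds up when probability mass shifts toward its weakest slice. In a robust Bayesian optimization loop, the same worst-case quantity would be computed from surrogate predictions rather than from observed scores.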
