While automatic prompt generation methods have recently received significant attention, their robustness remains poorly understood. In this paper, we introduce PertBench, a comprehensive benchmark dataset that covers a wide range of input perturbations and is designed to systematically evaluate the robustness of current auto-prompting techniques. Our analysis reveals substantial vulnerabilities in existing prompt generation strategies: even minor modifications to a prompt can cause large shifts in model output. To address this issue, we propose PGO, a gradient-free prompt generation framework that leverages perturbation types as pseudo-gradient signals to guide LLMs in producing more robust prompts. In contrast to existing methods that assess prompt quality only on clean, well-structured inputs, our approach explicitly emphasizes robustness under noisy and perturbed conditions. Extensive experiments across diverse tasks and multiple LLMs show that PGO consistently outperforms previous methods in maintaining performance under input perturbations.
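The abstract names the mechanism but not its interface. The Python sketch below illustrates one way perturbation types could serve as pseudo-gradient signals in a gradient-free refinement loop: score the prompt under each perturbation type, treat the weakest type as the update direction, and ask a meta-LLM to rewrite the prompt against it. Every identifier here (`PERTURBATIONS`, `perturbation_scores`, `pgo_step`, `llm`, `meta_llm`, `evaluate`) is a hypothetical stand-in, not the paper's actual API.

```python
# Minimal sketch of a pseudo-gradient, gradient-free prompt refinement step.
# All names are illustrative assumptions; the abstract does not specify them.

PERTURBATIONS = {
    "typo": lambda s: s.replace("e", "3", 1),   # character-level noise
    "case_flip": lambda s: s.swapcase(),        # casing corruption
    "word_drop": lambda s: " ".join(            # sporadic token deletion
        w for i, w in enumerate(s.split()) if i % 7
    ),
}

def perturbation_scores(prompt, task_inputs, llm, evaluate):
    """Score `prompt` under each perturbation type; the low-scoring types
    act as pseudo-gradient signals indicating what to harden against."""
    return {
        name: evaluate([llm(prompt + "\n" + perturb(x)) for x in task_inputs])
        for name, perturb in PERTURBATIONS.items()
    }

def pgo_step(prompt, task_inputs, llm, meta_llm, evaluate):
    """One gradient-free step: find the perturbation type the prompt is
    most brittle to, then ask a meta-LLM to rewrite the prompt against it."""
    scores = perturbation_scores(prompt, task_inputs, llm, evaluate)
    weakest = min(scores, key=scores.get)       # pseudo-gradient direction
    critique = (
        f"The prompt below degrades under '{weakest}' input noise "
        f"(score {scores[weakest]:.2f}). Rewrite it so the instruction "
        f"still works when inputs are perturbed this way.\n\nPROMPT:\n{prompt}"
    )
    return meta_llm(critique)
```

Iterating `pgo_step` until the per-perturbation scores plateau would give the overall loop; the stopping criterion and the choice of meta-LLM are left open by the abstract.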