Large Language Models (LLMs) have demonstrated significant ability in various Natural Language Processing tasks. However, their effectiveness is highly dependent on the phrasing of the task prompt, which has led to research on automatic prompt optimization using labeled task data. We reveal that these prompt optimization techniques are vulnerable to distribution shifts such as subpopulation shifts, which are common for LLMs in real-world scenarios such as customer review analysis. In this light, we propose a new problem of robust prompt optimization for LLMs against distribution shifts, which requires that the prompt optimized over the labeled source group simultaneously generalize to an unlabeled target group. To solve this problem, we propose the Generalized Prompt Optimization framework, which incorporates the unlabeled data from the target group into prompt optimization. Extensive experimental results demonstrate the effectiveness of the proposed framework, with significant performance improvement on the target group and comparable performance on the source group.