Prompt engineering has made significant contributions in the era of large language models, yet its effectiveness depends on the skill of the prompt author. Automatic prompt optimization can support the prompt development process, but it requires annotated data. This paper introduces $\textit{iPrOp}$, a novel interactive prompt optimization system that bridges manual prompt engineering and automatic prompt optimization. By keeping a human in the optimization loop, $\textit{iPrOp}$ gives users the flexibility to assess prompts as they evolve. We present users with prompt variations, selected instances, large language model predictions with corresponding explanations, and performance metrics computed on a subset of the training data. Users can then choose and further refine the provided prompts according to their individual preferences and needs. The system not only assists non-technical domain experts in generating optimal prompts tailored to their specific tasks or domains, but also enables the study of the intrinsic parameters that influence the performance of prompt optimization. Our evaluation shows that the system can generate improved prompts that lead to better task performance.
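To make the interaction loop concrete, one round of it might be sketched as follows. This is a minimal illustration under stated assumptions, not the $\textit{iPrOp}$ implementation: the names `interactive_round`, `accuracy`, `toy_predict`, and `pick_best` are hypothetical, the language model is replaced by a keyword rule, the user is replaced by a function that picks the highest-scoring prompt, and the generation of prompt variations and explanations is omitted.

```python
import random
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (input text, gold label)


def accuracy(prompt: str, data: List[Example],
             predict: Callable[[str, str], str]) -> float:
    """Fraction of examples whose prediction matches the gold label."""
    if not data:
        return 0.0
    return sum(predict(prompt, x) == y for x, y in data) / len(data)


def interactive_round(
    prompts: List[str],
    train: List[Example],
    predict: Callable[[str, str], str],
    choose: Callable[[List[Tuple[str, float]]], str],
    subset_size: int = 4,
    seed: int = 0,
) -> str:
    """Score each candidate prompt on a sampled training subset,
    then hand the scored candidates to the user, who selects one."""
    rng = random.Random(seed)
    subset = rng.sample(train, min(subset_size, len(train)))
    scored = [(p, accuracy(p, subset, predict)) for p in prompts]
    return choose(scored)


def toy_predict(prompt: str, text: str) -> str:
    """Hypothetical stand-in for a language model call (assumption)."""
    if "sentiment" in prompt.lower():
        return "pos" if "good" in text else "neg"
    return "neg"  # an underspecified prompt yields a constant guess


def pick_best(scored: List[Tuple[str, float]]) -> str:
    """Hypothetical stand-in for the user: take the top-scoring prompt."""
    return max(scored, key=lambda pair: pair[1])[0]


train = [("good movie", "pos"), ("bad movie", "neg"),
         ("good food", "pos"), ("bad food", "neg")]
prompts = ["Classify the text.", "Classify the sentiment of the text."]
best = interactive_round(prompts, train, toy_predict, pick_best)
print(best)
```

In an interactive setting, `choose` would instead display the scored prompts together with predictions on the sampled instances and return the prompt the user selects or edits, and the round would repeat with new variations of that prompt.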