Mixed-initiative dialogue tasks involve repeated exchanges of information and conversational control. Conversational agents gain control by generating responses that follow particular dialogue intents or strategies prescribed by a policy planner. The standard approach has been to fine-tune pre-trained language models to perform generation conditioned on these intents. However, these supervised generation models are limited by the cost and quality of data annotation. We instead prompt large language models as a drop-in replacement for fine-tuning on conditional generation. We formalize prompt construction for controllable mixed-initiative dialogue. Our findings show improvements over both fine-tuned models and ground-truth responses according to human evaluation and automatic metrics on two tasks: PersuasionForGood and Emotional Support Conversations.
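To make the idea of intent-conditioned prompting concrete, the following is a minimal sketch of how a prompt might be assembled from a task description, the dialogue history, and an intent chosen by a policy planner. The prompt layout, the `Turn` type, the `build_prompt` function, and the intent labels and descriptions are all illustrative assumptions; they are not the prompt format defined in the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    speaker: str  # e.g. "Persuader" or "Persuadee"
    text: str


# Hypothetical mapping from a planner's intent label to a plain-language
# description. Labels and wording are illustrative assumptions only.
INTENT_DESCRIPTIONS = {
    "emotional_appeal": "appeals to the listener's emotions",
    "credibility_appeal": "cites the organization's credentials",
}


def build_prompt(task: str, history: List[Turn], speaker: str, intent: str) -> str:
    """Assemble a prompt that conditions the next response on `intent`."""
    if intent not in INTENT_DESCRIPTIONS:
        raise KeyError(f"unknown intent: {intent}")
    lines = [task, ""]
    # Replay the conversation so far, one line per turn.
    lines += [f"{t.speaker}: {t.text}" for t in history]
    # State the strategy in plain language right before the turn to generate.
    lines.append(f"The {speaker} {INTENT_DESCRIPTIONS[intent]}.")
    # Leave the final turn open for the language model to complete.
    lines.append(f"{speaker}:")
    return "\n".join(lines)
```

The returned string would be sent to a large language model, whose completion serves as the agent's next turn. Changing the `intent` argument changes only the strategy sentence, so the planner can steer generation without any fine-tuning.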