In task-oriented dialogue, a system often needs to follow a sequence of actions, called a workflow, that complies with a set of guidelines in order to complete a task. In this paper, we propose the novel problem of multi-step workflow action prediction, in which the system predicts multiple future workflow actions. Accurate prediction of multiple steps allows for multi-turn automation, which can free up human time for more complex tasks. We propose three modeling approaches that are simple to implement yet lead to more action automation: 1) fine-tuning on a training dataset, 2) few-shot in-context learning leveraging retrieval and large language model prompting, and 3) zero-shot graph traversal, which aggregates historical action sequences into a graph for prediction. We show that multi-step action prediction produces features that improve accuracy on downstream dialogue tasks such as predicting task success, and can increase automation of steps by 20% without requiring as much feedback from a human overseeing the system.
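As an illustration of the third approach, the sketch below shows one possible way to aggregate historical action sequences into a graph and traverse it to predict several future actions. It is a minimal sketch under stated assumptions, not the method as implemented in the paper: it assumes a first-order graph of transition counts and a greedy traversal that always follows the most frequent outgoing edge. The function names (`build_action_graph`, `predict_next_actions`) and the action labels are hypothetical.

```python
from collections import Counter, defaultdict


def build_action_graph(sequences):
    """Aggregate action sequences into a directed graph of transition counts.

    Assumption: only first-order transitions (action -> next action) are kept.
    """
    graph = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            graph[current][nxt] += 1
    return graph


def predict_next_actions(graph, last_action, num_steps):
    """Greedily follow the most frequent outgoing edge for up to num_steps steps.

    Stops early when the current action has no recorded successor.
    """
    predictions = []
    current = last_action
    for _ in range(num_steps):
        successors = graph.get(current)
        if not successors:
            break
        current = successors.most_common(1)[0][0]
        predictions.append(current)
    return predictions


# Hypothetical historical workflows, used for illustration only.
history = [
    ["pull-up-account", "verify-identity", "update-order"],
    ["pull-up-account", "verify-identity", "offer-refund"],
    ["pull-up-account", "verify-identity", "update-order"],
]
graph = build_action_graph(history)
print(predict_next_actions(graph, "pull-up-account", 2))
# prints ['verify-identity', 'update-order']
```

Because the graph is built only from counts over past sequences, this sketch needs no model training, which is consistent with the zero-shot framing. A fuller version might condition on more than the last action or weight edges by dialogue context.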