Dialogue act annotations are important for improving response generation quality in task-oriented dialogue systems. However, using dialogue acts to control response generation in a generalizable way can be challenging, because different datasets and tasks may have incompatible annotations. Alternative methods that use latent action spaces or reinforcement learning do not require explicit annotations, but they may lack interpretability or face difficulties in defining task-specific rewards. In this work, we present a novel end-to-end latent dialogue act model (DiactTOD) that represents dialogue acts in a latent space. When pre-trained on a large corpus, DiactTOD can use these latent representations to predict and control dialogue acts, and thereby generate controllable responses, in a zero-shot fashion. Our approach demonstrates state-of-the-art performance on the MultiWOZ dataset across a wide range of experimental settings, including zero-shot, few-shot, and full-data fine-tuning, in both end-to-end and policy optimization configurations.