Autoregressive models used to generate responses in open-domain dialogue systems often struggle to take long-term context into account and to maintain consistency over a dialogue. Previous research in open-domain dialogue generation has shown that \emph{auxiliary tasks} can introduce inductive biases that encourage the model to improve these qualities. However, most previous research has focused on encoder-only or encoder-decoder models, while the use of auxiliary tasks in \emph{decoder-only} autoregressive models is under-explored. This paper describes an investigation in which four different auxiliary tasks are added to small and medium-sized GPT-2 models fine-tuned on the PersonaChat and DailyDialog datasets. The results show that introducing the new auxiliary tasks leads to small but consistent improvements in evaluations of the investigated models.