Large-scale open-domain dialogue systems such as PLATO-2 have achieved state-of-the-art scores in both English and Chinese. However, little work has explored whether such dialogue systems also work well in Japanese. In this work, we create a large-scale Japanese dialogue dataset, Dialogue-Graph, which contains 1.656 million tree-structured dialogues drawn from news, TV subtitles, and Wikipedia corpora. We then train PLATO-2 on Dialogue-Graph to build a large-scale Japanese dialogue system, PLATO-JDS. In addition, to improve how PLATO-JDS handles topic switches, we introduce a topic-switch algorithm built on a topic discriminator, which switches to a new topic when the user input departs from the previous topic. We evaluate the user experience of our model on four metrics: coherence, informativeness, engagingness, and humanness. In human evaluation with a human-bot chat strategy, PLATO-JDS achieves an average score of 1.500 out of a maximum of 2.000, which suggests that PLATO-2 can generate high-quality dialogue in Japanese. Furthermore, with the proposed topic-switch algorithm the average score rises to 1.767, which is 0.267 higher than PLATO-JDS alone, indicating that the algorithm is effective in improving the user experience of our system.
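The topic-switch idea described above can be illustrated with a minimal sketch. This is not the implementation used in PLATO-JDS: the abstract does not specify the discriminator or what happens on a switch. The sketch assumes a simple token-overlap (Jaccard) score as a stand-in for a learned topic discriminator, a hypothetical `threshold` parameter, and the hypothetical choice of clearing the dialogue context when a switch is detected. The names `jaccard` and `TopicSwitchContext` are illustrative only.

```python
from dataclasses import dataclass, field


def jaccard(a: set, b: set) -> float:
    """Token-overlap similarity in [0, 1].

    Assumption: a stand-in for a learned topic discriminator.
    """
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


@dataclass
class TopicSwitchContext:
    # Hypothetical cutoff: below this similarity the input counts as a new topic.
    threshold: float = 0.1
    # Utterances kept as context for the response generator.
    history: list = field(default_factory=list)

    def same_topic(self, utterance: str) -> bool:
        """Return True if the utterance appears to continue the current topic."""
        if not self.history:
            return True
        # Whitespace tokenization is for illustration only; Japanese text
        # would need a proper morphological tokenizer.
        prev_tokens = set(" ".join(self.history).lower().split())
        new_tokens = set(utterance.lower().split())
        return jaccard(prev_tokens, new_tokens) >= self.threshold

    def update(self, utterance: str) -> bool:
        """Add an utterance to the context; return True if a topic switch was detected."""
        switched = not self.same_topic(utterance)
        if switched:
            # Assumed behavior: drop the old context so that generation
            # conditions only on the new topic.
            self.history.clear()
        self.history.append(utterance)
        return switched
```

In use, each user input would be passed to `update` before generation, and the response model would then be conditioned on `history`. A learned discriminator could replace `jaccard` without changing the surrounding logic.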