An efficient retriever that draws knowledge from a large-scale knowledge base (KB) is critical for task-oriented dialogue systems to handle localized and specialized tasks. However, widely used generative models such as T5 and ChatGPT often struggle to distinguish subtle differences among the retrieved KB records when generating responses, which degrades response quality. In this paper, we propose applying maximal marginal likelihood to train a perceptive retriever, using signals from response generation as supervision. In addition, our approach goes beyond the retrieved entities alone and incorporates various meta knowledge to guide the generator, improving knowledge utilization. We evaluate our approach on three task-oriented dialogue datasets with T5 and ChatGPT as the backbone models. The results show that, when combined with meta knowledge, the response generator can effectively leverage high-quality knowledge records from the retriever and produce higher-quality responses. The code and models for this paper are available at https://github.com/shenwzh3/MK-TOD.
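To make the marginal-likelihood idea concrete, the following is a minimal sketch and not the released MK-TOD implementation. It assumes the retriever assigns a score to each of the top-K KB records, that these scores are normalized with a softmax over the K records, and that the generator supplies a log-likelihood of the gold response conditioned on each record. The function names `marginal_nll` and `retriever_score_grad` are hypothetical. The loss is the negative log of the response likelihood marginalized over records, and its gradient with respect to each retriever score is the retrieval prior minus the generator-informed posterior, which is how signals from response generation can supervise the retriever.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of floats."""
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]


def marginal_nll(retriever_scores, gen_log_likelihoods):
    """Negative log marginal likelihood of the response.

    Computes -log sum_k p(z_k | c) * p(r | c, z_k), where p(z_k | c) is a
    softmax over the retriever scores of the top-K records (an assumption
    of this sketch) and gen_log_likelihoods[k] = log p(r | c, z_k).
    """
    if len(retriever_scores) != len(gen_log_likelihoods):
        raise ValueError("one generator log-likelihood is needed per record")
    log_prior = log_softmax(retriever_scores)
    joint = [lp + ll for lp, ll in zip(log_prior, gen_log_likelihoods)]
    m = max(joint)
    return -(m + math.log(sum(math.exp(j - m) for j in joint)))


def retriever_score_grad(retriever_scores, gen_log_likelihoods):
    """Gradient of marginal_nll with respect to each retriever score.

    Equals prior_k - posterior_k, so gradient descent raises the score of
    records that explain the response better than the retriever expected.
    """
    log_prior = log_softmax(retriever_scores)
    joint = [lp + ll for lp, ll in zip(log_prior, gen_log_likelihoods)]
    log_post = log_softmax(joint)  # normalizing the joint gives the posterior
    return [math.exp(lp) - math.exp(lq) for lp, lq in zip(log_prior, log_post)]


if __name__ == "__main__":
    # Two records with equal retriever scores; the generator finds the
    # gold response far more likely given the first record.
    scores = [0.0, 0.0]
    gen_ll = [math.log(0.5), math.log(0.1)]
    print(marginal_nll(scores, gen_ll))
    print(retriever_score_grad(scores, gen_ll))
```

In this toy case the gradient is negative for the first record and positive for the second, so a descent step increases the first record's score and decreases the second's. In a real system the scores and log-likelihoods would come from trainable models and the gradient would be obtained by automatic differentiation rather than the closed form used here.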