Building conversational agents that can hold natural, knowledge-grounded interactions with humans requires understanding user utterances. Entity linking (EL) is an effective and widely used method for understanding natural language text and connecting it to external knowledge. It has been shown, however, that existing EL methods developed for annotating documents are suboptimal for conversations, where personal entities (e.g., "my cars") and concepts are essential for understanding user utterances. In this paper, we introduce a collection and a tool for entity linking in conversations. We collect EL annotations for 1327 conversational utterances, consisting of links to named entities, concepts, and personal entities. The dataset is used to train CREL, our toolkit for conversational entity linking. Unlike existing EL methods, CREL is designed to identify both named entities and concepts. It also uses coreference resolution techniques to identify personal entities and their references to explicit entity mentions in the conversation. We compare CREL with state-of-the-art techniques and show that it outperforms all existing baselines.
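To make the three annotation types concrete, the following is a minimal toy sketch of what linking named entities, concepts, and personal entities in a conversation could look like. It is not CREL's implementation or API: the knowledge base `TOY_KB`, the `Link` record, the `link_conversation` function, and the entity identifiers are all hypothetical, and a dictionary lookup plus a "most recent matching antecedent" rule stands in for the trained mention detection and coreference resolution components that the abstract describes.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical toy knowledge base: surface form -> (entity ID, kind, category).
# IDs, kinds, and categories are invented for illustration only.
TOY_KB = {
    "toyota corolla": ("Toyota_Corolla", "named_entity", "car"),
    "road trip": ("Road_trip", "concept", None),
}

# Possessive phrases such as "my car" or "my cars" mark personal entity mentions.
PERSONAL = re.compile(r"\bmy (\w+)\b", re.IGNORECASE)


@dataclass
class Link:
    utterance_index: int
    mention: str
    entity: Optional[str]  # None when no antecedent was found
    kind: str  # "named_entity", "concept", or "personal_entity"


def link_conversation(utterances: List[str]) -> List[Link]:
    links = []
    seen = []  # (category, entity ID) of explicit mentions, in order of appearance
    for i, text in enumerate(utterances):
        lowered = text.lower()
        # Step 1: dictionary lookup for named entities and concepts.
        for surface, (entity, kind, category) in TOY_KB.items():
            if surface in lowered:
                links.append(Link(i, surface, entity, kind))
                seen.append((category, entity))
        # Step 2: resolve each personal mention to the most recent explicit
        # mention of the same category (a crude stand-in for coreference).
        for match in PERSONAL.finditer(text):
            noun = match.group(1).lower()
            category = noun[:-1] if noun.endswith("s") else noun  # naive singular
            antecedent = next((e for c, e in reversed(seen) if c == category), None)
            links.append(Link(i, match.group(0), antecedent, "personal_entity"))
    return links


conversation = [
    "I just bought a Toyota Corolla.",
    "My car needs new tires before our road trip.",
]
for link in link_conversation(conversation):
    print(link)
```

In this toy conversation, "My car" in the second utterance is resolved to the explicit mention "Toyota Corolla" from the first, which is the kind of link a document-oriented entity linker would typically miss. A personal mention with no matching antecedent keeps `entity=None` rather than being forced onto an unrelated entity.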