Commonsense reasoning is omnipresent in human communication and is therefore an important capability for open-domain dialogue systems. However, evaluating commonsense in dialogue systems remains an open challenge. We take the first step by focusing on event commonsense, which concerns events and their relations and is crucial in both dialogue and general commonsense reasoning. We propose ACCENT, an event commonsense evaluation metric empowered by commonsense knowledge bases (CSKBs). ACCENT first extracts event-relation tuples from a dialogue and then evaluates the response by scoring the tuples for their compatibility with the CSKB. To evaluate ACCENT, we construct the first public event commonsense evaluation dataset for open-domain dialogues. Our experiments show that ACCENT is an efficient metric for event commonsense evaluation, achieving higher correlations with human judgments than existing baselines.
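The two-stage idea described in the abstract (extract event-relation tuples, then score them for compatibility with a CSKB) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not ACCENT's actual implementation: the toy knowledge base `TOY_CSKB`, the relation labels, the word-overlap similarity, and the averaging rule are all hypothetical, and the extraction stage is skipped by passing in tuples that are assumed to be already extracted.

```python
from typing import List, Tuple

# (head event, relation, tail event); the relation labels are hypothetical.
Triple = Tuple[str, str, str]

# Toy knowledge base with invented entries, standing in for a real CSKB.
TOY_CSKB: List[Triple] = [
    ("PersonX eats dinner", "effect", "PersonX feels full"),
    ("PersonX eats dinner", "prerequisite", "PersonX cooks food"),
    ("PersonX misses the bus", "effect", "PersonX is late"),
]


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity, a crude stand-in for a learned similarity."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def score_tuple(candidate: Triple, kb: List[Triple]) -> float:
    """Compatibility of one tuple with the KB: best match among same-relation entries."""
    head, relation, tail = candidate
    matches = [
        jaccard(head, kb_head) * jaccard(tail, kb_tail)
        for kb_head, kb_relation, kb_tail in kb
        if kb_relation == relation
    ]
    return max(matches, default=0.0)


def score_response(tuples: List[Triple], kb: List[Triple]) -> float:
    """Average tuple compatibility; returning 0.0 for no tuples is an arbitrary choice."""
    if not tuples:
        return 0.0
    return sum(score_tuple(t, kb) for t in tuples) / len(tuples)


if __name__ == "__main__":
    sensible = [("PersonX eats dinner", "effect", "PersonX feels full")]
    odd = [("PersonX eats dinner", "effect", "PersonX is late")]
    print(score_response(sensible, TOY_CSKB))
    print(score_response(odd, TOY_CSKB))
```

In this sketch a tuple that matches a knowledge-base entry scores higher than one that pairs an event with an unrelated effect, which is the behavior the compatibility step is meant to capture. A realistic system would replace the word-overlap similarity with a learned model and would extract the tuples from the dialogue automatically.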