Open Relation Extraction (OpenRE) aims to discover novel relations from open domains. Previous OpenRE methods suffer from two main problems: (1) Insufficient capacity to discriminate between known and novel relations. When the conventional test setting is extended to a more general one in which test data may also come from seen classes, the performance of existing approaches declines significantly. (2) Secondary labeling is required before practical application. Existing methods cannot assign human-readable, meaningful type labels to novel relations, which downstream tasks urgently require. To address these issues, we propose the Active Relation Discovery (ARD) framework, which uses relational outlier detection to discriminate between known and novel relations and incorporates active learning to label novel relations. Extensive experiments on three real-world datasets show that ARD significantly outperforms previous state-of-the-art methods in both the conventional setting and our proposed general OpenRE setting. The source code and datasets will be made available for reproducibility.