We introduce a novel graph-based framework that alleviates key challenges in distantly supervised relation extraction, and we demonstrate its effectiveness in the challenging and important biomedical domain. Specifically, we propose a graph view of the bag of sentences that mention an entity pair, which enables message-passing-based aggregation of information about that pair across the bag. The framework mitigates the noisy labeling that is common in distantly supervised relation extraction and effectively incorporates inter-dependencies between sentences within a bag. Extensive experiments on two large-scale biomedical relation datasets and the widely used NYT dataset show that the framework significantly outperforms state-of-the-art methods for distantly supervised biomedical relation extraction, while also performing strongly on relation extraction in the general text mining domain.
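To make the graph view of a sentence bag concrete, the following is a minimal sketch of one way message passing over a bag could look. It is not the model evaluated in this work. Everything in it is an illustrative assumption: the sentence vectors are taken as given (produced by some unspecified encoder), the graph is fully connected, edge weights come from a softmax over dot-product similarity, each node mixes its own vector with the weighted message using a hypothetical `self_weight`, and the bag representation is a simple mean readout. All function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    # Subtract the maximum for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def message_passing_round(nodes: List[Vector], self_weight: float = 0.5) -> List[Vector]:
    """One round over a fully connected sentence graph (assumed topology).

    Each sentence node receives a similarity-weighted average of the other
    sentences in the bag and mixes it with its own vector.
    """
    if len(nodes) < 2:
        return [list(v) for v in nodes]  # nothing to exchange
    updated = []
    for i, h in enumerate(nodes):
        neighbours = [v for j, v in enumerate(nodes) if j != i]
        weights = softmax([dot(h, v) for v in neighbours])
        message = [
            sum(w * v[d] for w, v in zip(weights, neighbours))
            for d in range(len(h))
        ]
        updated.append(
            [self_weight * a + (1.0 - self_weight) * m for a, m in zip(h, message)]
        )
    return updated


def bag_representation(sentence_vectors: List[Vector], rounds: int = 1) -> Vector:
    """Aggregate a bag of sentence vectors into a single vector (mean readout)."""
    nodes = [list(v) for v in sentence_vectors]
    for _ in range(rounds):
        nodes = message_passing_round(nodes)
    count = len(nodes)
    return [sum(column) / count for column in zip(*nodes)]


if __name__ == "__main__":
    # Toy bag: two similar sentences and one dissimilar (possibly noisy) one.
    bag = [[1.0, 0.1], [0.9, 0.2], [-0.5, 1.0]]
    print(bag_representation(bag, rounds=2))
```

In this sketch, sentences that agree with each other exchange larger messages than a sentence that disagrees with the rest of the bag, which is one simple way inter-sentence dependencies can influence the bag-level vector passed to a relation classifier.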