Relation extraction aims to classify the relationship between two entities into pre-defined categories. While previous research has mainly focused on sentence-level relation extraction, recent studies have expanded the scope to document-level relation extraction. Traditional relation extraction methods rely heavily on human-annotated training data, which is time-consuming and labor-intensive to produce. To reduce the need for manual annotation, recent weakly-supervised approaches have been developed for sentence-level relation extraction, but little work has addressed document-level relation extraction. Weakly-supervised document-level relation extraction faces two significant challenges: an imbalanced number of "no relation" instances, and the failure of directly probing pretrained large language models for document-level relation extraction. To address these challenges, we propose PromptRE, a novel weakly-supervised document-level relation extraction method that combines prompting-based techniques with data programming. PromptRE also incorporates the label distribution and entity types as prior knowledge to improve performance. By leveraging the strengths of both prompting and data programming, PromptRE achieves improved relation classification performance and effectively handles the "no relation" problem. Experimental results on ReDocRED, a benchmark dataset for document-level relation extraction, demonstrate that PromptRE outperforms baseline approaches.
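To make the idea of combining weak votes with a label prior and entity-type knowledge more concrete, the following is a minimal sketch and not PromptRE's actual procedure. It assumes each weak source (for example, a prompt-based labeler) emits either a relation name or None to abstain. The relation names, the type constraints, the prior values, and the single fixed source accuracy are all hypothetical, chosen only for illustration. Data programming normally learns source accuracies from the votes themselves; here a fixed accuracy and a simple log-odds vote count are swapped in to keep the example short.

```python
import math
from collections import Counter

NO_RELATION = "no_relation"

# Hypothetical schema: relation -> allowed (head entity type, tail entity type) pairs.
TYPE_CONSTRAINTS = {
    "place_of_birth": {("PER", "LOC")},
    "employer": {("PER", "ORG")},
    "headquarters": {("ORG", "LOC")},
}

# Hypothetical label prior; "no relation" dominates, mirroring the imbalance
# described in the abstract. These numbers are invented for illustration.
LABEL_PRIOR = {
    "place_of_birth": 0.03,
    "employer": 0.04,
    "headquarters": 0.03,
    NO_RELATION: 0.90,
}

# Assumed accuracy shared by every weak source (a simplification: data
# programming would estimate a separate accuracy per source).
SOURCE_ACCURACY = 0.8


def aggregate_votes(votes, head_type, tail_type):
    """Pick a label for one entity pair from weak votes.

    votes: list of relation names, with None meaning the source abstained.
    Votes whose relation is incompatible with the entity types are discarded.
    Each remaining vote adds a fixed log-odds weight to its relation's score,
    which starts at the log prior. "No relation" is scored by its prior alone.
    """
    vote_weight = math.log(SOURCE_ACCURACY / (1.0 - SOURCE_ACCURACY))
    counts = Counter(
        v for v in votes
        if v in TYPE_CONSTRAINTS and (head_type, tail_type) in TYPE_CONSTRAINTS[v]
    )
    scores = {NO_RELATION: math.log(LABEL_PRIOR[NO_RELATION])}
    for relation, n in counts.items():
        scores[relation] = math.log(LABEL_PRIOR[relation]) + n * vote_weight
    return max(scores, key=scores.get)


if __name__ == "__main__":
    print(aggregate_votes(["employer", "employer", "employer"], "PER", "ORG"))
    print(aggregate_votes(["employer", None, None], "PER", "ORG"))
    print(aggregate_votes(["employer", "employer", "employer"], "PER", "LOC"))
```

Under these made-up numbers, a relation needs three agreeing, type-compatible votes to overcome the strong "no relation" prior. One or two votes, or any number of votes that violate the type constraints, fall back to "no relation". This illustrates how a prior and type information can keep sparse weak signals from flooding the output with false positives.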