Document-level relation extraction (DocRE) aims to extract the relations between all entity pairs in a document. A key challenge in DocRE is the cost of annotating such data, which requires intensive human effort. We therefore investigate DocRE in a low-resource setting and find that existing models trained on limited data overestimate the NA ("no relation") label, which limits their performance. In this work, we approach the problem from a calibration perspective and propose PRiSM, which learns to adapt logits based on relation semantic information. We evaluate our method on three DocRE datasets and demonstrate that integrating existing models with PRiSM improves performance by up to 26.38 F1 points, while the calibration error drops by a factor of up to 36 when trained with about 3% of the data. The code is publicly available at https://github.com/brightjade/PRiSM.
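To make the notion of calibration error concrete, the following is a minimal sketch of expected calibration error (ECE) with equal-width confidence bins. The abstract does not state which calibration metric is used, so ECE, the function name `expected_calibration_error`, the bin count, and the toy inputs are all assumptions for illustration and are not taken from the PRiSM code.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width bins (an assumed metric, not confirmed by the abstract).

    confidences: predicted probability of the chosen label, one per prediction.
    correct: 1 if the prediction was right, 0 otherwise.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Bins are (lo, hi]; the first bin also takes confidence exactly 0.0.
        in_bin = (confidences > lo) & (confidences <= hi)
        if lo == 0.0:
            in_bin |= confidences == 0.0
        if not in_bin.any():
            continue
        # Gap between observed accuracy and mean confidence in this bin,
        # weighted by the fraction of predictions that fall in the bin.
        gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
        ece += in_bin.mean() * gap
    return float(ece)


# Toy inputs (hypothetical): an overconfident model and a well-calibrated one.
overconfident = expected_calibration_error([0.95] * 4, [1, 0, 0, 0])
calibrated = expected_calibration_error([0.75] * 4, [1, 1, 1, 0])
print(overconfident, calibrated)
```

In this toy setup the overconfident predictions mimic a model that assigns high probability to one label (for example NA) far more often than it is correct, which is the kind of gap a calibration-oriented method would aim to shrink.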