Recently, numerous efforts have continued to push the performance boundaries of document-level relation extraction (DocRE) and have claimed significant progress. In this paper, we do not aim to propose a novel DocRE model. Instead, we take a closer look at the field to see whether these performance gains are real. Through a comprehensive literature review and a thorough examination of popular DocRE datasets, we find that these gains all rest on a shared assumption that is strong, or even untenable: all named entities are perfectly localized, normalized, and typed in advance. Next, we construct four types of entity mention attacks to examine the robustness of typical DocRE models through behavioral probing. We also closely examine model usability in a more realistic setting. Our findings reveal that most current DocRE models are vulnerable to entity mention attacks and are difficult to deploy in real-world end-user NLP applications. Our study calls on future research to stop simplifying problem setups and to model DocRE in the wild rather than in an unrealistic Utopian world.