As information extraction (IE) systems have grown more adept at processing whole documents, the classic task of template filling has seen renewed interest as a benchmark for document-level IE. In this position paper, we call into question the suitability of template filling for this purpose. We argue that the task demands definitive answers to thorny questions of event individuation -- the problem of distinguishing distinct events -- about which even human experts disagree. Through an annotation study and error analysis, we show that this raises concerns about the usefulness of template filling metrics, the quality of datasets for the task, and the ability of models to learn it. Finally, we consider possible solutions.