This paper presents a scheme for annotating coreference across news articles that extends beyond traditional identity relations to include near-identity and bridging relations. It describes in detail how to set up the annotation tool Inception, how to annotate entities in news articles, how to connect them with diverse coreferential relations, and how to link them across documents to Wikidata's global knowledge graph. This multi-layered annotation approach is discussed in the context of media bias. Our main contribution is a methodology for creating a diverse cross-document coreference corpus that can be applied to the analysis of media bias by word choice and labelling.