Mapping ongoing news headlines to event-related classes in a rich knowledge base can be an important component in a knowledge-based event analysis and forecasting solution. In this paper, we present a methodology for creating a benchmark dataset of news headlines mapped to event classes in Wikidata, and resources for the evaluation of methods that perform the mapping. We use the dataset to study two classes of unsupervised methods for this task: 1) adaptations of classic entity linking methods, and 2) methods that treat the problem as a zero-shot text classification problem. For the first approach, we evaluate off-the-shelf entity linking systems. For the second approach, we explore a) pre-trained natural language inference (NLI) models, and b) pre-trained large generative language models. We present the results of our evaluation, lessons learned, and directions for future work. The dataset and scripts for evaluation are made publicly available.
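To make the zero-shot framing concrete, the following is a minimal sketch of how headline classification can be cast as natural language inference: each candidate event class is turned into a hypothesis sentence, and the class whose hypothesis is most strongly entailed by the headline is ranked first. The class labels, the hypothesis template, and the function names are illustrative assumptions, not taken from the paper. The scorer is a toy token-overlap stand-in so that the example runs without any model; in practice it would be replaced by the entailment probability of a pre-trained NLI model.

```python
from typing import Callable, List, Tuple

# Hypothetical candidate event classes (illustrative labels only,
# not the Wikidata classes used in the paper).
CANDIDATE_CLASSES: List[str] = ["earthquake", "election", "protest"]


def make_hypothesis(label: str) -> str:
    """Turn a class label into an NLI hypothesis (assumed template)."""
    return f"This headline is about {label}."


def toy_entailment_score(premise: str, hypothesis: str) -> float:
    """Toy stand-in for an NLI model: the fraction of hypothesis tokens
    that also occur in the premise. A real system would return the
    model's entailment probability instead."""
    premise_tokens = set(premise.lower().split())
    hypothesis_tokens = hypothesis.lower().rstrip(".").split()
    hits = sum(token in premise_tokens for token in hypothesis_tokens)
    return hits / len(hypothesis_tokens)


def classify(
    headline: str,
    labels: List[str],
    score: Callable[[str, str], float] = toy_entailment_score,
) -> List[Tuple[str, float]]:
    """Score every candidate class against the headline and return
    (label, score) pairs sorted from best to worst."""
    scored = [(label, score(headline, make_hypothesis(label))) for label in labels]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = classify("Strong earthquake strikes coastal region", CANDIDATE_CLASSES)
for label, value in ranking:
    print(f"{label}: {value:.2f}")
```

Because the scorer is passed in as a parameter, the same ranking loop can be reused with any entailment model, which keeps the classification logic separate from the choice of model.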