We introduce a meta dataset for few-shot relation extraction, which includes two datasets derived from the existing supervised relation extraction datasets NYT29 (Takanobu et al., 2019; Nayak and Ng, 2020) and WIKIDATA (Sorokin and Gurevych, 2017), as well as a few-shot form of the TACRED dataset (Sabo et al., 2021). Importantly, all these few-shot datasets were generated under realistic assumptions, for example that the test relations differ from any relations a model might have seen before, that training data is limited, and that a preponderance of candidate relation mentions do not correspond to any of the relations of interest. Using this large resource, we conduct a comprehensive evaluation of six recent few-shot relation extraction methods and observe that no method emerges as a clear winner. Further, overall performance on this task is low, indicating a substantial need for future research. We release all versions of the data, both supervised and few-shot, to support that research.