Relation extraction models trained on a source domain cannot be applied to a different target domain because their relation sets do not match. In the current literature, there is no extensive open-source relation extraction dataset specific to the finance domain. In this paper, we release FinRED, a relation extraction dataset curated from financial news and earnings call transcripts that contains relations from the finance domain. FinRED was created by mapping Wikidata triplets onto this text using distant supervision. We manually annotate the test data to ensure proper evaluation. We also benchmark several state-of-the-art relation extraction models on this dataset. Their performance drops significantly on FinRED compared with general relation extraction datasets, which indicates that better models are needed for financial relation extraction.
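To make the distant supervision step concrete, the following is a minimal sketch of the general technique, not the pipeline used to build FinRED. It assumes a toy knowledge base of (head, relation, tail) triplets and labels a sentence with a triplet whenever both entity names appear in it. The `Triplet` type, the `distant_label` helper, and the example sentences are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Triplet(NamedTuple):
    head: str
    relation: str
    tail: str


def distant_label(sentence: str, triplets: list[Triplet]) -> list[Triplet]:
    """Return every triplet whose head and tail both occur in the sentence.

    Assumption: plain case-insensitive substring matching stands in for
    the entity linking a real pipeline would need.
    """
    text = sentence.lower()
    return [t for t in triplets if t.head.lower() in text and t.tail.lower() in text]


# Toy knowledge base; entries are illustrative, not drawn from FinRED.
kb = [
    Triplet("Apple", "chief executive officer", "Tim Cook"),
    Triplet("Microsoft", "headquarters location", "Redmond"),
]

# Hypothetical sentences in the style of financial news.
sentences = [
    "Tim Cook said Apple expects services revenue to keep growing.",
    "Microsoft shares rose after the earnings call.",
]

labeled = [(s, distant_label(s, kb)) for s in sentences]
for s, matches in labeled:
    print(s, [(m.head, m.relation, m.tail) for m in matches])
```

Labels produced this way are noisy, since two entities can co-occur in a sentence that does not express the relation, which is one reason a manually annotated test set matters for evaluation.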