The superior performance of supervised classification methods in information extraction (IE) relies heavily on large amounts of gold-standard data. Recent zero-shot classification methods convert the task into another NLP task (e.g., textual entailment) and use off-the-shelf models for that task to perform inference directly on the test data, without requiring large amounts of IE annotation. A potentially valuable by-product of these methods is large-scale silver-standard data, i.e., data pseudo-labeled by the off-the-shelf models of the other NLP tasks. However, the use of these data has not been further investigated. In this paper, we propose a new framework, Clean-LaVe, which aims to exploit silver-standard data to enhance zero-shot performance. Clean-LaVe consists of four phases: (1) obtaining silver data; (2) identifying relatively clean data within the silver data; (3) finetuning the off-the-shelf model on the clean data; (4) performing inference on the test data. Experimental results show that Clean-LaVe outperforms the baseline by 5% and 6% on the TACRED and Wiki80 datasets in zero-shot relation classification, by 3%-7% on Smile (Korean and Polish) in zero-shot cross-lingual relation classification, and by 8% on ACE05-E+ in zero-shot event argument classification. The code is shared at https://github.com/wjw136/Clean_LaVe.git.