In this paper, we explore zero- and few-shot generalization for fact verification (FV), where the goal is to transfer an FV model trained on well-resourced domains (e.g., Wikipedia) to low-resourced domains that lack human annotations. To this end, we first construct a benchmark collection of 11 FV datasets spanning 6 domains. We then conduct an empirical analysis of generalization across these datasets and find that current models generalize poorly. Our analysis reveals that several factors affect generalization, including dataset size, evidence length, and claim type. Finally, we show that two directions of work improve generalization: 1) incorporating domain knowledge via pretraining on specialized domains, and 2) automatically generating training data via claim generation.