Claim-Dissector: An Interpretable Fact-Checking System with Joint Re-ranking and Veracity Prediction

We present Claim-Dissector: a novel latent variable model for fact-checking and analysis, which, given a claim and a set of retrieved evidence, jointly learns to identify (i) the evidence relevant to the given claim and (ii) the veracity of the claim. We propose to disentangle the per-evidence relevance probability and its contribution to the final veracity probability in an interpretable way: the final veracity probability is proportional to a linear ensemble of per-evidence relevance probabilities. In this way, the contribution of each piece of evidence to the final predicted probability can be identified. Within the per-evidence relevance probability, our model can further distinguish whether each relevant piece of evidence supports (S) or refutes (R) the claim. This makes it possible to quantify how much the S/R probability contributes to the final verdict, or to detect disagreeing evidence. Despite its interpretable nature, our system achieves results competitive with the state of the art on the FEVER dataset, as compared to typical two-stage system pipelines, while using significantly fewer parameters. It also sets a new state of the art on the FAVIQ and RealFC datasets. Furthermore, our analysis shows that our model can learn fine-grained relevance cues while using coarse-grained supervision, and we demonstrate this in two ways. (i) We show that our model can achieve competitive sentence recall while using only paragraph-level relevance supervision. (ii) Moving to the finest granularity of relevance, we show that our model is capable of identifying relevance at the token level. To do this, we present a new benchmark, TLR-FEVER, focusing on token-level interpretability: humans annotate the tokens in relevant evidence that they considered essential when making their judgment. We then measure how similar these annotations are to the tokens our model focuses on.
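
The linear-ensemble idea described above can be illustrated with a small sketch. This is not the paper's implementation: the function names, the uniform default weights, the restriction to the two classes S and R, and the 0.5 disagreement threshold are all assumptions made for illustration. The sketch only shows how a verdict that is proportional to a weighted sum of per-evidence probabilities can be decomposed into per-evidence contributions.

```python
# Illustrative sketch only; names, weights, and threshold are hypothetical
# and are not taken from the Claim-Dissector paper or code.

CLASSES = ("S", "R")  # supporting / refuting


def veracity_from_evidence(relevance, weights=None):
    """Combine per-evidence S/R probabilities into a claim-level verdict.

    relevance: list of dicts such as {"S": 0.8, "R": 0.1}, one per evidence.
    weights:   optional per-evidence weights (assumed uniform if omitted).
    Returns (veracity, contributions); contributions[c][i] is the share of
    the final probability of class c that comes from evidence i.
    """
    n = len(relevance)
    if weights is None:
        weights = [1.0 / n] * n
    # Each term is one evidence's weighted probability for one class.
    terms = {c: [w * r[c] for w, r in zip(weights, relevance)] for c in CLASSES}
    total = sum(sum(terms[c]) for c in CLASSES)
    if total == 0:
        raise ValueError("no evidence carries any S/R probability mass")
    # The verdict is proportional to the linear combination of the terms.
    veracity = {c: sum(terms[c]) / total for c in CLASSES}
    contributions = {c: [t / total for t in terms[c]] for c in CLASSES}
    return veracity, contributions


def has_disagreement(relevance, threshold=0.5):
    """True if some evidence clearly supports and some clearly refutes."""
    supports = any(r["S"] >= threshold for r in relevance)
    refutes = any(r["R"] >= threshold for r in relevance)
    return supports and refutes


evidence = [
    {"S": 0.80, "R": 0.10},  # mostly supporting
    {"S": 0.10, "R": 0.60},  # mostly refuting
    {"S": 0.05, "R": 0.05},  # mostly irrelevant
]
veracity, contributions = veracity_from_evidence(evidence)
```

Because the combination is linear, the per-evidence contributions for a class sum exactly to that class's final probability, which is what makes the verdict attributable to individual pieces of evidence in this toy setting.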
