DialFact: A Benchmark for Fact-Checking in Dialogue


October 15, 2021


Prakhar Gupta, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong

Fact-checking is an essential tool for mitigating the spread of misinformation and disinformation. However, it has mostly been explored for verifying formal single-sentence claims rather than casual conversational claims. To study this problem, we introduce the task of fact-checking in dialogue. We construct DialFact, a testing benchmark dataset of 22,245 annotated conversational claims, paired with pieces of evidence from Wikipedia. DialFact has three sub-tasks: 1) verifiable claim detection, which determines whether a response carries verifiable factual information; 2) evidence retrieval, which retrieves the most relevant Wikipedia snippets as evidence; and 3) claim verification, which classifies a dialogue response as supported, refuted, or not enough information. We find that existing fact-checking models trained on non-dialogue data such as FEVER fail to perform well on our task, and we therefore propose a simple yet data-efficient solution that effectively improves fact-checking performance in dialogue. In our error analysis, we point out challenges unique to DialFact, such as handling colloquialisms, coreferences, and retrieval ambiguities, to shed light on future research in this direction.
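To make the three sub-tasks concrete, the following is a minimal sketch of one possible way to chain them on a single dialogue response. It is not the authors' implementation: the function names (`check_response`, `toy_is_verifiable`, `make_toy_retriever`, `toy_verify`), the `Verdict` type, the label strings, and the digit and word-overlap heuristics are all hypothetical stand-ins chosen for illustration. A real system would presumably plug trained models into the same three slots.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional

# Label strings are an assumption; the abstract only names the three classes.
LABELS = ("SUPPORTED", "REFUTED", "NOT ENOUGH INFORMATION")


@dataclass
class Verdict:
    verifiable: bool
    evidence: List[str]
    label: Optional[str]  # None when the response carries no checkable fact


def check_response(
    context: List[str],
    response: str,
    is_verifiable: Callable[[List[str], str], bool],
    retrieve: Callable[[List[str], str], List[str]],
    verify: Callable[[List[str], str, List[str]], str],
) -> Verdict:
    """Run the three sub-tasks in order on one dialogue response."""
    # 1) Verifiable claim detection: skip chit-chat with no factual content.
    if not is_verifiable(context, response):
        return Verdict(False, [], None)
    # 2) Evidence retrieval: fetch candidate snippets for the claim.
    evidence = retrieve(context, response)
    # 3) Claim verification: classify the response against the evidence.
    label = verify(context, response, evidence)
    if label not in LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    return Verdict(True, evidence, label)


# --- Toy stand-ins (hypothetical heuristics, not trained models) ---

def _tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def toy_is_verifiable(context: List[str], response: str) -> bool:
    # Treat any response that mentions a number as a checkable claim.
    return any(ch.isdigit() for ch in response)


def make_toy_retriever(snippets: List[str], top_k: int = 1):
    def retrieve(context: List[str], response: str) -> List[str]:
        # Include the last context turn so that a pronoun such as "it"
        # can still be matched to the entity named earlier in the dialogue.
        query = _tokens(" ".join(context[-1:] + [response]))
        scored = [(len(query & _tokens(s)), s) for s in snippets]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [s for _, s in scored[:top_k]]
    return retrieve


def toy_verify(context: List[str], response: str, evidence: List[str]) -> str:
    if not evidence:
        return "NOT ENOUGH INFORMATION"
    claim_numbers = set(re.findall(r"\d+", response))
    evidence_numbers = set(re.findall(r"\d+", " ".join(evidence)))
    # Supported only if every number in the claim also occurs in the evidence.
    return "SUPPORTED" if claim_numbers <= evidence_numbers else "REFUTED"


if __name__ == "__main__":
    retrieve = make_toy_retriever([
        "The Eiffel Tower was completed in 1889.",
        "Mount Everest is 8849 metres tall.",
    ])
    context = ["Have you ever seen the Eiffel Tower?"]
    for reply in ("Yeah, I think it was finished back in 1889!",
                  "Yeah, I think it was finished back in 1925!",
                  "I love visiting Paris."):
        print(reply, "->", check_response(
            context, reply, toy_is_verifiable, retrieve, toy_verify))
```

Passing the three stages in as callables keeps the control flow separate from the models, so each sub-task can be swapped out or evaluated on its own. The retriever's use of the previous turn hints at why coreference is harder in dialogue than in single-sentence claims: the response above never names the Eiffel Tower itself.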

