Case-Grounded Evidence Verification: A Framework for Constructing Evidence-Sensitive Supervision - 专知论文

会员服务 ·

0

Case-Grounded Evidence Verification: A Framework for Constructing Evidence-Sensitive Supervision

翻译：案例锚定证据验证：一种构建证据敏感监督的框架

Soroosh Tayebi Arasteh,Mehdi Joodaki,Mahshad Lotfinia,Sven Nebelung,Daniel Truhn

Evidence-grounded reasoning requires more than attaching retrieved text to a prediction: a model should make decisions that depend on whether the provided evidence supports the target claim. In practice, this often fails because supervision is weak, evidence is only loosely tied to the claim, and evaluation does not test evidence dependence directly. We introduce case-grounded evidence verification, a general framework in which a model receives a local case context, external evidence, and a structured claim, and must decide whether the evidence supports the claim for that case. Our key contribution is a supervision construction procedure that generates explicit support examples together with semantically controlled non-support examples, including counterfactual wrong-state and topic-related negatives, without manual evidence annotation. We instantiate the framework in radiology and train a standard verifier on the resulting support task. The learned verifier substantially outperforms both case-only and evidence-only baselines, remains strong under correct evidence, and collapses when evidence is removed or swapped, indicating genuine evidence dependence. This behavior transfers across unseen evidence articles and an external case distribution, though performance degrades under evidence-source shift and remains sensitive to backbone choice. Overall, the results suggest that a major bottleneck in evidence grounding is not only model capacity, but the lack of supervision that encodes the causal role of evidence.

翻译：基于证据的推理不仅需要将检索到的文本附加到预测结果上：模型应做出依赖于所提供证据是否支持目标主张的决策。在实践中，这一过程常因监督信号薄弱、证据与主张关联松散以及评估未能直接测试证据依赖性而失败。我们提出案例锚定证据验证这一通用框架——模型接收局部案例上下文、外部证据及结构化主张，需判断该案例中证据是否支持该主张。我们的核心贡献在于构建了一种监督生成流程，可自动生成显式支持样本与语义可控的非支持样本（包括反事实错误状态及主题相关负样本），无需人工证据标注。我们在放射学领域实例化该框架，基于生成的验证任务训练标准验证器。训练后的验证器显著优于纯案例基线和纯证据基线，在正确证据下保持强性能，当移除或替换证据时性能骤降，表明其具备真实的证据依赖性。该行为可迁移至未见过的证据文章及外部案例分布，但证据源偏移时性能下降，且仍受主干网络选择影响。总体而言，结果表明证据锚定推理的主要瓶颈不仅在于模型容量，更在于缺乏编码证据因果作用的监督机制。

0

相关内容

【EPFL博士论文】因果推断的方法学进展：实验、识别与估计

【EPFL博士论文】因果推断的方法学进展：实验、识别与估计

专知会员服务

16+阅读 · 2025年11月5日

ChatGPT背后“推理”如何做？浙大等最新《基于语言模型提示的推理》综述，阐述大模型提示推理机制与方法体系

ChatGPT背后“推理”如何做？浙大等最新《基于语言模型提示的推理》综述，阐述大模型提示推理机制与方法体系

专知会员服务

112+阅读 · 2023年5月6日

《考虑广义证据理论不完备识别框架的空战态势评估》2022西工大论文【Scientific Reports】

《考虑广义证据理论不完备识别框架的空战态势评估》2022西工大论文【Scientific Reports】

专知会员服务

65+阅读 · 2023年1月3日

EMNLP 2021 | 基于证据检索和图神经验证网络的表格事实验证模型

EMNLP 2021 | 基于证据检索和图神经验证网络的表格事实验证模型

专知会员服务

20+阅读 · 2021年12月12日

证据推理理论及其应用

专知会员服务

49+阅读 · 2021年5月24日

因果关联学习，Causal Relational Learning

因果关联学习，Causal Relational Learning

专知会员服务

185+阅读 · 2020年4月21日

【ACL2020】生成事实验证解释，Generating Fact Checking Explanations

【ACL2020】生成事实验证解释，Generating Fact Checking Explanations

专知会员服务

17+阅读 · 2020年4月15日

最新「因果推断Causal Inference」综述论文38页pdf，Buffalo、Georgia、阿里巴巴、Virginia

专知会员服务

183+阅读 · 2020年2月11日

【论文】用于推理的概率逻辑神经网络（Probabilistic Logic Neural Networks for Reasoning）

【论文】用于推理的概率逻辑神经网络（Probabilistic Logic Neural Networks for Reasoning）

专知会员服务

104+阅读 · 2019年12月30日

【自监督学习新成果】基于对比预测编码的数据高效图像识别（Data-Efficient Image Recognition with Contrastive Predictive Coding）

【自监督学习新成果】基于对比预测编码的数据高效图像识别（Data-Efficient Image Recognition with Contrastive Predictive Coding）

专知会员服务

16+阅读 · 2019年12月10日

「知识增强预训练语言模型」最新研究综述

「知识增强预训练语言模型」最新研究综述

专知

18+阅读 · 2022年11月18日

「因果推理」概述论文，13页pdf

「因果推理」概述论文，13页pdf

专知

16+阅读 · 2021年3月20日

【KDD2020-Tutorial】因果推理与稳定学习，Causal Inference and Stable Learning

【KDD2020-Tutorial】因果推理与稳定学习，Causal Inference and Stable Learning

专知

11+阅读 · 2020年8月28日

基于深度元学习的因果推断新方法

基于深度元学习的因果推断新方法

图与推荐

12+阅读 · 2020年7月21日

【WWW2020-新加坡国立大学】知识图谱强化负采样的推荐系统，Reinforced Negative Sampling

【WWW2020-新加坡国立大学】知识图谱强化负采样的推荐系统，Reinforced Negative Sampling

专知

22+阅读 · 2020年3月14日

最新「因果推断Causal Inference」综述论文38页pdf，阿里巴巴、Buffalo、Georgia、Virginia

最新「因果推断Causal Inference」综述论文38页pdf，阿里巴巴、Buffalo、Georgia、Virginia

专知

68+阅读 · 2020年2月11日

因果推理学习算法资源大列表

因果推理学习算法资源大列表

专知

27+阅读 · 2019年3月3日

论文浅尝 | 基于深度强化学习的远程监督数据集的降噪

论文浅尝 | 基于深度强化学习的远程监督数据集的降噪

开放知识图谱

29+阅读 · 2019年1月17日

【论文推荐】最新6篇目标检测相关论文—场景文本检测、显著对象、语义知识转移、混合监督目标检测、域自适应、车牌识别

【论文推荐】最新6篇目标检测相关论文—场景文本检测、显著对象、语义知识转移、混合监督目标检测、域自适应、车牌识别

专知

19+阅读 · 2018年3月16日

关系推理：基于表示学习和语义要素

关系推理：基于表示学习和语义要素

计算机研究与发展

19+阅读 · 2017年8月22日

融合事件关系推理和情感博弈的网络不实信息演化机理研究

国家自然科学基金

2+阅读 · 2015年12月31日

高维回归模型的预测稳定性研究

国家自然科学基金

3+阅读 · 2015年12月31日

非确定型Web服务流程重组的可靠性验证技术

国家自然科学基金

1+阅读 · 2015年12月31日

网络安全威胁踪源分析方法研究

国家自然科学基金

19+阅读 · 2015年12月31日

面向DS证据理论的关联信息融合研究

国家自然科学基金

4+阅读 · 2015年12月31日

大数据环境下的证券市场操纵行为发现机理、模型与方法

国家自然科学基金

0+阅读 · 2015年12月31日

面向微博数据的位置相关事件检测和时空异常聚类模式挖掘研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于案事件流数据的犯罪行为时空分析研究

国家自然科学基金

3+阅读 · 2014年12月31日

不确定性推理与语义网中知识表示的数学基础

国家自然科学基金

18+阅读 · 2012年12月31日

因果推断及不完全数据的统计分析

国家自然科学基金

23+阅读 · 2008年12月31日

A Hybrid Retrieval and Reranking Framework for Evidence-Grounded Retrieval-Augmented Generation

Arxiv

0+阅读 · 5月3日

Generating Verifiable Chain of Thoughts from Exection-Traces

Arxiv

0+阅读 · 4月27日

Prototype-Grounded Concept Models for Verifiable Concept Alignment

Arxiv

0+阅读 · 4月17日

Online Learnability of Chain-of-Thought Verifiers: Soundness and Completeness Trade-offs

Arxiv

0+阅读 · 4月7日

Eligibility-Aware Evidence Synthesis: An Agentic Framework for Clinical Trial Meta-Analysis

Arxiv

0+阅读 · 4月3日

EvidenceRL: Reinforcing Evidence Consistency for Trustworthy Language Models

Arxiv

0+阅读 · 3月20日

Towards Verifiable AI with Lightweight Cryptographic Proofs of Inference

Arxiv

0+阅读 · 3月19日

TRACE: Temporal Rule-Anchored Chain-of-Evidence on Knowledge Graphs for Interpretable Stock Movement Prediction

Arxiv

0+阅读 · 3月12日

Seeing the Reasoning: How LLM Rationales Influence User Trust and Decision-Making in Factual Verification Tasks

Arxiv

0+阅读 · 3月7日

Multi-Sourced, Multi-Agent Evidence Retrieval for Fact-Checking

Arxiv

0+阅读 · 2月27日

VIP会员

文章信息

相关主题

最新内容

《国防领域敏感性分析白皮书》

《国防领域敏感性分析白皮书》

专知会员服务

0+阅读 · 今天3:42

综述 | 从问答到任务完成：Agent系统与Harness设计

综述 | 从问答到任务完成：Agent系统与Harness设计

专知会员服务

3+阅读 · 6月24日

Agentic RL：框架、实践与长程智能体训练

Agentic RL：框架、实践与长程智能体训练

专知会员服务

2+阅读 · 6月24日

反无人机拦截器训练与运用课程：对美国陆军部队发展的启示

反无人机拦截器训练与运用课程：对美国陆军部队发展的启示

专知会员服务

8+阅读 · 6月24日

重新思考无人机时代的生存能力

重新思考无人机时代的生存能力

专知会员服务

6+阅读 · 6月24日

装甲突击旅：现代战争思考、战斗与组织

装甲突击旅：现代战争思考、战斗与组织

专知会员服务

5+阅读 · 6月24日

在人工智能加速决策环境中拓展OODA循环

在人工智能加速决策环境中拓展OODA循环

专知会员服务

6+阅读 · 6月24日

《廉价自杀式无人机战争的军事战略影响：乌克兰与伊朗案例研究》

《廉价自杀式无人机战争的军事战略影响：乌克兰与伊朗案例研究》

专知会员服务

6+阅读 · 6月24日

军事欺骗：供作战战术指挥官使用的工具

军事欺骗：供作战战术指挥官使用的工具

专知会员服务

5+阅读 · 6月24日

ICML 2026 | CFPO：用反事实策略优化提升多模态推理

ICML 2026 | CFPO：用反事实策略优化提升多模态推理

专知会员服务

4+阅读 · 6月23日

综述 | 世界动作模型：少做梦，多行动

综述 | 世界动作模型：少做梦，多行动

专知会员服务

7+阅读 · 6月23日

美以伊冲突：无人机与人工智能的运用

美以伊冲突：无人机与人工智能的运用

专知会员服务

12+阅读 · 6月23日

《战时图神经网络：整合以色列-伊朗冲突中的网络安全与无人机智能》最新50页文献

《战时图神经网络：整合以色列-伊朗冲突中的网络安全与无人机智能》最新50页文献

专知会员服务

6+阅读 · 6月23日

《特种部队在透明战场中的生存力》最新报告

《特种部队在透明战场中的生存力》最新报告

专知会员服务

5+阅读 · 6月23日

《自主无人机蜂群协同与控制系统：人工智能赋能的战场协同与自主任务编排平台》

《自主无人机蜂群协同与控制系统：人工智能赋能的战场协同与自主任务编排平台》

专知会员服务

9+阅读 · 6月23日

相关VIP内容

【EPFL博士论文】因果推断的方法学进展：实验、识别与估计

【EPFL博士论文】因果推断的方法学进展：实验、识别与估计

专知会员服务

16+阅读 · 2025年11月5日

ChatGPT背后“推理”如何做？浙大等最新《基于语言模型提示的推理》综述，阐述大模型提示推理机制与方法体系

ChatGPT背后“推理”如何做？浙大等最新《基于语言模型提示的推理》综述，阐述大模型提示推理机制与方法体系

专知会员服务

112+阅读 · 2023年5月6日

《考虑广义证据理论不完备识别框架的空战态势评估》2022西工大论文【Scientific Reports】

《考虑广义证据理论不完备识别框架的空战态势评估》2022西工大论文【Scientific Reports】

专知会员服务

65+阅读 · 2023年1月3日

EMNLP 2021 | 基于证据检索和图神经验证网络的表格事实验证模型

EMNLP 2021 | 基于证据检索和图神经验证网络的表格事实验证模型

专知会员服务

20+阅读 · 2021年12月12日

证据推理理论及其应用

专知会员服务

49+阅读 · 2021年5月24日

因果关联学习，Causal Relational Learning

因果关联学习，Causal Relational Learning

专知会员服务

185+阅读 · 2020年4月21日

【ACL2020】生成事实验证解释，Generating Fact Checking Explanations

【ACL2020】生成事实验证解释，Generating Fact Checking Explanations

专知会员服务

17+阅读 · 2020年4月15日

最新「因果推断Causal Inference」综述论文38页pdf，Buffalo、Georgia、阿里巴巴、Virginia

专知会员服务

183+阅读 · 2020年2月11日

【论文】用于推理的概率逻辑神经网络（Probabilistic Logic Neural Networks for Reasoning）

【论文】用于推理的概率逻辑神经网络（Probabilistic Logic Neural Networks for Reasoning）

专知会员服务

104+阅读 · 2019年12月30日

【自监督学习新成果】基于对比预测编码的数据高效图像识别（Data-Efficient Image Recognition with Contrastive Predictive Coding）

【自监督学习新成果】基于对比预测编码的数据高效图像识别（Data-Efficient Image Recognition with Contrastive Predictive Coding）

专知会员服务

16+阅读 · 2019年12月10日

热门VIP内容

开通专知VIP会员享更多权益服务

综述 | 从问答到任务完成：Agent系统与Harness设计

反无人机拦截器训练与运用课程：对美国陆军部队发展的启示

《国防领域敏感性分析白皮书》

Agentic RL：框架、实践与长程智能体训练

相关资讯

「知识增强预训练语言模型」最新研究综述

「知识增强预训练语言模型」最新研究综述

专知

18+阅读 · 2022年11月18日

「因果推理」概述论文，13页pdf

「因果推理」概述论文，13页pdf

专知

16+阅读 · 2021年3月20日

【KDD2020-Tutorial】因果推理与稳定学习，Causal Inference and Stable Learning

【KDD2020-Tutorial】因果推理与稳定学习，Causal Inference and Stable Learning

专知

11+阅读 · 2020年8月28日

基于深度元学习的因果推断新方法

基于深度元学习的因果推断新方法

图与推荐

12+阅读 · 2020年7月21日

【WWW2020-新加坡国立大学】知识图谱强化负采样的推荐系统，Reinforced Negative Sampling

【WWW2020-新加坡国立大学】知识图谱强化负采样的推荐系统，Reinforced Negative Sampling

专知

22+阅读 · 2020年3月14日

最新「因果推断Causal Inference」综述论文38页pdf，阿里巴巴、Buffalo、Georgia、Virginia

最新「因果推断Causal Inference」综述论文38页pdf，阿里巴巴、Buffalo、Georgia、Virginia

专知

68+阅读 · 2020年2月11日

因果推理学习算法资源大列表

因果推理学习算法资源大列表

专知

27+阅读 · 2019年3月3日

论文浅尝 | 基于深度强化学习的远程监督数据集的降噪

论文浅尝 | 基于深度强化学习的远程监督数据集的降噪

开放知识图谱

29+阅读 · 2019年1月17日

【论文推荐】最新6篇目标检测相关论文—场景文本检测、显著对象、语义知识转移、混合监督目标检测、域自适应、车牌识别

【论文推荐】最新6篇目标检测相关论文—场景文本检测、显著对象、语义知识转移、混合监督目标检测、域自适应、车牌识别

专知

19+阅读 · 2018年3月16日

关系推理：基于表示学习和语义要素

关系推理：基于表示学习和语义要素

计算机研究与发展

19+阅读 · 2017年8月22日

相关论文

A Hybrid Retrieval and Reranking Framework for Evidence-Grounded Retrieval-Augmented Generation

Arxiv

0+阅读 · 5月3日

Generating Verifiable Chain of Thoughts from Exection-Traces

Arxiv

0+阅读 · 4月27日

Prototype-Grounded Concept Models for Verifiable Concept Alignment

Arxiv

0+阅读 · 4月17日

Online Learnability of Chain-of-Thought Verifiers: Soundness and Completeness Trade-offs

Arxiv

0+阅读 · 4月7日

Eligibility-Aware Evidence Synthesis: An Agentic Framework for Clinical Trial Meta-Analysis

Arxiv

0+阅读 · 4月3日

EvidenceRL: Reinforcing Evidence Consistency for Trustworthy Language Models

Arxiv

0+阅读 · 3月20日

Towards Verifiable AI with Lightweight Cryptographic Proofs of Inference

Arxiv

0+阅读 · 3月19日

TRACE: Temporal Rule-Anchored Chain-of-Evidence on Knowledge Graphs for Interpretable Stock Movement Prediction

Arxiv

0+阅读 · 3月12日

Seeing the Reasoning: How LLM Rationales Influence User Trust and Decision-Making in Factual Verification Tasks

Arxiv

0+阅读 · 3月7日

Multi-Sourced, Multi-Agent Evidence Retrieval for Fact-Checking

Arxiv

0+阅读 · 2月27日

相关基金

融合事件关系推理和情感博弈的网络不实信息演化机理研究

国家自然科学基金

2+阅读 · 2015年12月31日

高维回归模型的预测稳定性研究

国家自然科学基金

3+阅读 · 2015年12月31日

非确定型Web服务流程重组的可靠性验证技术

国家自然科学基金

1+阅读 · 2015年12月31日

网络安全威胁踪源分析方法研究

国家自然科学基金

19+阅读 · 2015年12月31日

面向DS证据理论的关联信息融合研究

国家自然科学基金

4+阅读 · 2015年12月31日

大数据环境下的证券市场操纵行为发现机理、模型与方法

国家自然科学基金

0+阅读 · 2015年12月31日

面向微博数据的位置相关事件检测和时空异常聚类模式挖掘研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于案事件流数据的犯罪行为时空分析研究

国家自然科学基金

3+阅读 · 2014年12月31日

不确定性推理与语义网中知识表示的数学基础

国家自然科学基金

18+阅读 · 2012年12月31日

因果推断及不完全数据的统计分析

国家自然科学基金

23+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员