VeriGraph: Towards Verifiable Data-Analytic Agents

LLM-based agents have demonstrated strong capabilities in data-intensive analytical tasks, yet their outputs are rarely verifiable: their reliance on linear text trajectories makes their reasoning difficult to audit. In particular, deterministic computations over raw data and semantic deductions over natural-language claims are often entangled in an unstructured stream, leaving numerical conclusions hard to reproduce and qualitative judgments hard to inspect. To address this, we propose VeriGraph, a traceable neuro-symbolic reasoning framework that enables agents to construct an explicit heterogeneous evidence directed acyclic graph (DAG) during execution. VeriGraph introduces three evidence-expansion primitives, namely computational, grounding, and derivational expansion, to connect raw data, interpreter variables, computed results, and natural-language claims in a unified graph. Under this formulation, structural traceability reduces to graph reachability from raw data sources to terminal claims, while semantic support is measured by claim-level evidence evaluation. To improve graph construction, we further design a graph-based policy optimization strategy with a composite reward that jointly supervises answer correctness, computational integrity, and derivational coherence. Experiments on four benchmarks show that VeriGraph-8B achieves a higher overall score than all baselines. More importantly, VeriGraph produces auditable evidence graphs with substantially stronger claim grounding, achieving an 87.61% Grounding Rate under our claim-level evidence support evaluation. These results suggest that explicit evidence-graph construction is a promising path toward verifiable data-analytic agents. Our code is available at https://github.com/ignorejjj/VeriGraph.
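The reduction of structural traceability to graph reachability can be illustrated with a minimal sketch. It is not taken from the VeriGraph codebase: the class `EvidenceGraph`, its method names, the node-kind labels, and the example nodes (`sales.csv`, `df`, `mean_revenue`, `claim_a`, `claim_b`) are all hypothetical, chosen only to show the idea of checking whether each terminal claim can be reached from some raw data source.

```python
from collections import deque
from dataclasses import dataclass, field

# The four node kinds named in the abstract; the string labels are assumptions.
RAW, VARIABLE, RESULT, CLAIM = "raw_data", "variable", "result", "claim"


@dataclass
class EvidenceGraph:
    """Hypothetical evidence DAG: typed nodes plus directed support edges."""
    kinds: dict = field(default_factory=dict)     # node id -> node kind
    children: dict = field(default_factory=dict)  # node id -> successor ids

    def add_node(self, node, kind):
        self.kinds[node] = kind
        self.children.setdefault(node, [])

    def add_edge(self, src, dst):
        # Edge src -> dst means "dst was produced from / is supported by src".
        self.children[src].append(dst)

    def reachable_from_raw(self):
        """Breadth-first search starting from every raw-data node."""
        frontier = deque(n for n, k in self.kinds.items() if k == RAW)
        seen = set(frontier)
        while frontier:
            node = frontier.popleft()
            for nxt in self.children[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
        return seen

    def untraceable_claims(self):
        """Terminal claims (no successors) not reachable from any raw data."""
        seen = self.reachable_from_raw()
        return sorted(
            n for n, k in self.kinds.items()
            if k == CLAIM and not self.children[n] and n not in seen
        )


# Toy graph: claim_a is backed by a computation chain, claim_b is not.
g = EvidenceGraph()
g.add_node("sales.csv", RAW)
g.add_node("df", VARIABLE)
g.add_node("mean_revenue", RESULT)
g.add_node("claim_a", CLAIM)
g.add_node("claim_b", CLAIM)
g.add_edge("sales.csv", "df")
g.add_edge("df", "mean_revenue")
g.add_edge("mean_revenue", "claim_a")

print(g.untraceable_claims())
```

In this toy graph, `claim_a` is structurally traceable because a path leads from `sales.csv` to it, while `claim_b` has no incoming path and is flagged. Reachability only establishes that a path exists; whether the evidence along that path actually supports the claim is the separate semantic question that the abstract assigns to claim-level evidence evaluation.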
