Claim verification is an important problem in high-stakes settings, including health and finance. When the information underpinning a claim is incomplete or conflicting, an uncertain answer may be more appropriate than a binary true or false classification. In all cases, faithful explanations of the considerations determining the final verdict are crucial. We introduce inference-time argumentation (ITA), a trainable neurosymbolic framework for ternary claim verification in which a formal argumentation semantics giving the strength of claims is used both (i) to guide LLM training, as models learn to generate arguments and assign them base scores (representing intrinsic strengths), and (ii) to compute ternary (true/false/uncertain) predictions from the generated, scored arguments. As a result, at training time, argument generation and scoring can be optimised according to the quality of the induced argumentative predictions. Moreover, at inference time, the final prediction is faithful, by construction, to the arguments and scores determining the verdict, rather than being justified by a potentially unfaithful post-hoc reasoning trace as in conventional reasoning models. Finally, we show that, on two datasets for ternary claim verification, ITA improves upon argumentative baselines and can perform competitively with non-argumentative direct-prediction baselines, while providing verdicts that are computed deterministically from explicit, inspectable argumentative structures.
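To make step (ii) concrete, the following minimal sketch shows how a ternary verdict could be computed deterministically from scored arguments. It rests on assumptions that the abstract does not state: it uses the DF-QuAD gradual semantics as a stand-in for the unnamed formal semantics, a flat structure in which every argument directly supports or attacks the claim, a neutral claim base score of 0.5, and a hypothetical `margin` threshold for the uncertain band. The names `Argument`, `claim_strength`, and `verdict` are illustrative only.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Argument:
    text: str
    base_score: float  # intrinsic strength in [0, 1], e.g. assigned by an LLM
    supports: bool     # True: supports the claim; False: attacks it


def aggregate(scores):
    """DF-QuAD aggregation (probabilistic sum); returns 0.0 for no arguments."""
    return 1.0 - prod(1.0 - s for s in scores)


def claim_strength(arguments, claim_base=0.5):
    """Strength of the claim under DF-QuAD (assumed semantics, flat structure)."""
    attack = aggregate([a.base_score for a in arguments if not a.supports])
    support = aggregate([a.base_score for a in arguments if a.supports])
    if attack >= support:
        # Attackers dominate: pull the strength towards 0.
        return claim_base - claim_base * (attack - support)
    # Supporters dominate: push the strength towards 1.
    return claim_base + (1.0 - claim_base) * (support - attack)


def verdict(strength, margin=0.1):
    """Map a strength in [0, 1] to a ternary label (margin is hypothetical)."""
    if strength > 0.5 + margin:
        return "true"
    if strength < 0.5 - margin:
        return "false"
    return "uncertain"


if __name__ == "__main__":
    args = [
        Argument("A cited trial reports the effect.", 0.8, supports=True),
        Argument("The trial had a small sample.", 0.3, supports=False),
    ]
    s = claim_strength(args)
    print(s, verdict(s))
```

Because the verdict is a pure function of the arguments and their base scores, inspecting those inputs fully explains the output, which is the sense of faithfulness by construction described above. How the scores are learned during training is outside the scope of this sketch.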