We present a novel approach to claim verification against tabular data documents. Recent LLM-based approaches either rely on complex pretraining or fine-tuning, or decompose verification into subtasks; they often lack comprehensive explanations and generalize poorly. To address these limitations, we propose a Multi-Agentic framework for Claim verification (MACE) consisting of three specialized agents: Planner, Executor, and Verifier. Instead of elaborate fine-tuning, each agent uses a zero-shot Chain-of-Thought setup to perform its task. MACE produces interpretable verification traces: the Planner generates an explicit reasoning strategy, the Executor provides detailed computation steps, and the Verifier validates the logic. Experiments show that MACE achieves state-of-the-art (SOTA) performance on two datasets and performs on par with the best models on two others, while reaching 80--100\% of the best performance with substantially smaller models (27--92B parameters versus 235B). This combination of competitive performance, memory efficiency, and transparent reasoning highlights the framework's effectiveness.
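The three-agent flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the prompt wording, the "Let's think step by step" cue (a common zero-shot Chain-of-Thought trigger), the way each agent's output is passed to the next, and all names (`verify_claim`, `Trace`, `table_to_text`, `stub_llm`) are assumptions made for illustration. The model is abstracted as a plain callable so the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical interface: any function mapping a prompt to a completion.
LLM = Callable[[str], str]

# Assumed zero-shot Chain-of-Thought cue; the paper's actual prompts are not given here.
COT_CUE = "Let's think step by step."


@dataclass
class Trace:
    """Verification trace: one entry per agent."""
    plan: str
    execution: str
    verdict: str


def table_to_text(header: Sequence[str], rows: Sequence[Sequence[object]]) -> str:
    """Serialize a table as pipe-separated lines, header first."""
    lines = [" | ".join(header)]
    lines += [" | ".join(str(cell) for cell in row) for row in rows]
    return "\n".join(lines)


def verify_claim(llm: LLM, header, rows, claim: str) -> Trace:
    """Run Planner, Executor, and Verifier in sequence (assumed ordering)."""
    context = f"Table:\n{table_to_text(header, rows)}\nClaim: {claim}\n"
    # Planner: propose an explicit reasoning strategy.
    plan = llm(f"[Planner] {context}Describe how to check the claim. {COT_CUE}")
    # Executor: carry out the plan and show the computation steps.
    execution = llm(f"[Executor] {context}Plan: {plan}\nCarry out the plan. {COT_CUE}")
    # Verifier: check the reasoning and give a final label.
    verdict = llm(
        f"[Verifier] {context}Plan: {plan}\nSteps: {execution}\n"
        f"Check the logic and answer SUPPORTED or REFUTED. {COT_CUE}"
    )
    return Trace(plan, execution, verdict)


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call, returning canned text per agent."""
    if prompt.startswith("[Planner]"):
        return "Find the 2021 row and compare its revenue with 100."
    if prompt.startswith("[Executor]"):
        return "The 2021 row has revenue 120; 120 > 100 holds."
    return "SUPPORTED"


if __name__ == "__main__":
    trace = verify_claim(
        stub_llm,
        ["year", "revenue"],
        [[2020, 90], [2021, 120]],
        "Revenue in 2021 exceeded 100.",
    )
    print(trace.plan)
    print(trace.execution)
    print(trace.verdict)
```

Replacing `stub_llm` with a call to an actual model would leave the control flow unchanged. The returned `Trace` keeps every intermediate output, which is the kind of interpretable record the abstract refers to.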