In recent years, table reasoning has garnered substantial research interest, particularly regarding its integration with Large Language Models (LLMs), which have revolutionized natural language applications. Existing LLM-based studies typically perform step-by-step table reasoning guided by task semantics. While these approaches emphasize autonomous exploration and enhance fine-grained table understanding, they often overlook systematic thinking in the reasoning process. This oversight can lead to omitted steps, disorganized logic and misleading results, especially in complex scenarios. In this paper, we propose PoTable, a novel stage-oriented plan-then-execute approach that incorporates systematic thinking into table reasoning. Specifically, PoTable divides the reasoning process into several distinct analytical stages, each with a clear objective that guides the work within it. To accomplish stage-specific goals, PoTable employs a plan-then-execute mechanism: it first plans the operation chain based on the stage objective, and then executes the operations sequentially through code generation, real-time running and feedback processing. Consequently, PoTable produces reliable table reasoning results together with highly accurate, step-wise commented and fully executable programs. It mirrors the workflow of a professional data analyst, offering advantages in both accuracy and explainability. Finally, we conduct extensive experiments on four datasets from the WikiTQ and TabFact benchmarks, where the results demonstrate the effectiveness, efficiency and explainability of PoTable. Our code is available at: https://github.com/Double680/PoTable.
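To make the plan-then-execute mechanism concrete, the following is a minimal sketch of the control flow only, not the released PoTable implementation. The stage names, the `plan_operations` and `generate_code` helpers, and the canned plans and code strings are all hypothetical stand-ins for the LLM calls; the first code attempt is deliberately wrong so that the feedback path is exercised.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    objective: str


# Hypothetical stages; names and objectives are illustrative assumptions.
STAGES = [
    Stage("clean", "make column types usable"),
    Stage("reason", "compute the value the question asks for"),
]

# Canned plans standing in for an LLM planner (assumption).
PLANS = {
    "clean": ["cast the score column to int"],
    "reason": ["count rows with score above 80"],
}

# Canned code standing in for an LLM code generator (assumption):
# (first attempt, attempt after receiving error feedback).
CODE = {
    "cast the score column to int": (
        "for row in table: row['score'] = int(row['Score'])",  # wrong key
        "for row in table: row['score'] = int(row['score'])",
    ),
    "count rows with score above 80": (
        "answer = sum(1 for row in table if row['score'] > 80)",
        "answer = sum(1 for row in table if row['score'] > 80)",
    ),
}

TABLE = [
    {"name": "a", "score": "91"},
    {"name": "b", "score": "78"},
    {"name": "c", "score": "85"},
]


def plan_operations(stage):
    """Plan the operation chain for one stage from its objective."""
    return PLANS[stage.name]


def generate_code(operation, feedback=None):
    """Return code for one operation; use the retry variant if feedback exists."""
    first, retry = CODE[operation]
    return first if feedback is None else retry


def plan_then_execute(table, max_retries=1):
    """Run every stage: plan its operations, then execute them one by one."""
    # Work on a copy so the caller's table is left untouched.
    env = {"table": [dict(row) for row in table]}
    program = []
    for stage in STAGES:
        for op in plan_operations(stage):
            feedback = None
            for _ in range(max_retries + 1):
                code = generate_code(op, feedback)
                try:
                    # Generated code should be sandboxed in any real setting.
                    exec(code, env)
                except Exception as exc:
                    # Feed the runtime error back into the next generation.
                    feedback = f"{type(exc).__name__}: {exc}"
                    continue
                program.append(f"# [{stage.name}] {op}\n{code}")
                break
            else:
                raise RuntimeError(f"operation failed: {op} ({feedback})")
    return env.get("answer"), "\n".join(program)


if __name__ == "__main__":
    answer, program = plan_then_execute(TABLE)
    print(answer)
    print(program)
```

Only code that ran successfully is appended to the returned program, so under these assumptions the final program is both commented per operation and executable end to end, which is the property the abstract attributes to PoTable's output.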