In recent years, table reasoning has garnered substantial research interest, particularly regarding its integration with Large Language Models (LLMs), which have revolutionized natural language applications. Existing LLM-based studies typically perform step-by-step table reasoning guided by task semantics. While these approaches emphasize autonomous exploration and enhance fine-grained table understanding, they often overlook systematic thinking in the reasoning process. This oversight can lead to omitted steps, disorganized logic and misleading results, especially in complex scenarios. In this paper, we propose PoTable, a novel stage-oriented plan-then-execute approach that incorporates systematic thinking into table reasoning. Specifically, PoTable divides the analysis into several distinct stages, each with a clear objective that guides the reasoning. To accomplish stage-specific goals, PoTable employs a plan-then-execute mechanism: it first plans the operation chain based on the stage objective, and then executes the operations sequentially through code generation, real-time execution and feedback processing. Consequently, PoTable produces reliable table reasoning results with highly accurate, step-wise commented and completely executable programs. It mirrors the workflow of a professional data analyst, offering advantages in both accuracy and explainability. Finally, we conduct extensive experiments on four datasets from the WikiTQ and TabFact benchmarks, where the results demonstrate the effectiveness, efficiency and explainability of PoTable. Our code is available at: https://github.com/Double680/PoTable.
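To make the plan-then-execute mechanism concrete, the following is a minimal sketch and not the released PoTable implementation. The stage names, the `plan` and `repair` functions, and the toy table are all hypothetical: `plan` returns canned operations where the real system would query an LLM, and `repair` stands in for regenerating code from execution feedback.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    objective: str


# Hypothetical stages; the abstract does not name the actual ones.
STAGES = [
    Stage("row_selection", "keep only rows relevant to the question"),
    Stage("type_cleaning", "convert the needed columns to numeric types"),
    Stage("reasoning", "compute the answer from the cleaned values"),
]

# Toy table for the question "total points after 2000?" (made-up data).
TABLE = [
    {"year": "1999", "points": "850"},
    {"year": "2003", "points": "1,200"},
    {"year": "2007", "points": "300"},
]


def plan(stage):
    """Stand-in planner: returns the operation chain for one stage as code strings."""
    canned = {
        "row_selection": ['rows = [r for r in table if int(r["year"]) > 2000]'],
        # Deliberately naive: fails on "1,200" so the feedback path is exercised.
        "type_cleaning": ['points = [int(r["points"]) for r in rows]'],
        "reasoning": ["answer = sum(points)"],
    }
    return canned[stage.name]


def repair(code, error):
    """Stand-in for feedback processing: patch the code based on the error."""
    if isinstance(error, ValueError):
        return code.replace('r["points"]', 'r["points"].replace(",", "")')
    return code


def run(table, max_retries=1):
    env = {"table": table}  # shared state carried across stages
    trace = []
    for stage in STAGES:
        for code in plan(stage):  # plan first ...
            for attempt in range(max_retries + 1):  # ... then execute
                try:
                    # Unsafe for untrusted code; a real system needs a sandbox.
                    exec(code, env)
                    trace.append((stage.name, code, "ok"))
                    break
                except Exception as exc:
                    trace.append((stage.name, code, f"failed: {type(exc).__name__}"))
                    if attempt == max_retries:
                        raise
                    code = repair(code, exc)
    return env["answer"], trace


answer, trace = run(TABLE)
for stage_name, code, status in trace:
    print(f"[{stage_name}] {status}: {code}")
print("answer:", answer)
```

Each stage contributes its own short operation chain, and every operation runs immediately, so a failure is caught and repaired at the step that caused it instead of surfacing at the end of a long program. The recorded trace doubles as a step-wise commented account of how the answer was derived.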