Table Question Answering (TableQA) benefits significantly from table pruning, which extracts compact sub-tables by eliminating redundant cells to streamline downstream reasoning. However, existing pruning methods typically rely on sequential revisions driven by unreliable critique signals, and they often fail to detect the loss of answer-critical data. To address this limitation, we propose TabTrim, a novel table pruning framework that replaces sequential revision with parallel search supervised by gold trajectories. TabTrim derives a gold pruning trajectory from the intermediate sub-tables produced while executing gold SQL queries, and trains a pruner and a verifier so that each step-wise pruning result aligns with that trajectory. During inference, TabTrim performs parallel search over multiple candidate pruning trajectories to identify the optimal sub-table. Extensive experiments demonstrate that TabTrim achieves state-of-the-art performance across diverse tabular reasoning tasks: TabTrim-8B reaches 73.5% average accuracy, outperforming the strongest baseline by 3.2%, and attains 79.4% on WikiTQ and 61.2% on TableBench.
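To make the inference-time idea concrete, the following is a minimal sketch of a pruner and a verifier cooperating in a search over candidate pruning trajectories. It is an illustration under stated assumptions, not the TabTrim implementation: the abstract does not specify the search procedure, so a simple beam-style search is assumed here, and `toy_pruner`, `toy_verifier`, and `parallel_prune` are hypothetical names. The keyword-matching heuristics merely stand in for the trained models.

```python
from typing import Callable, Dict, List

Table = List[Dict[str, str]]  # one dict per row, keyed by column name


def toy_pruner(table: Table, question: str) -> List[Table]:
    """Hypothetical stand-in for a trained pruner.

    Proposes every one-step pruning that drops a single column or row.
    """
    candidates: List[Table] = []
    columns = list(table[0].keys()) if table else []
    if len(columns) > 1:
        for col in columns:
            candidates.append(
                [{k: v for k, v in row.items() if k != col} for row in table]
            )
    if len(table) > 1:
        for i in range(len(table)):
            candidates.append(table[:i] + table[i + 1:])
    return candidates


def toy_verifier(table: Table, question: str) -> int:
    """Hypothetical stand-in for a trained verifier.

    Rewards sub-tables that still contain the column names and cell values
    mentioned in the question, and penalizes every remaining cell.
    """
    words = set(question.lower().replace("?", "").split())
    found = set()
    cells = 0
    for row in table:
        for col, val in row.items():
            cells += 1
            for token in (col.lower(), val.lower()):
                if token in words:
                    found.add(token)
    return 10 * len(found) - cells


def parallel_prune(
    table: Table,
    question: str,
    pruner: Callable[[Table, str], List[Table]],
    verifier: Callable[[Table, str], int],
    beam_width: int = 3,
    max_steps: int = 4,
) -> Table:
    """Assumed beam-style search: expand several trajectories per step and
    keep the best-scoring sub-table seen so far."""
    beam = [table]
    best, best_score = table, verifier(table, question)
    for _ in range(max_steps):
        expanded = [cand for t in beam for cand in pruner(t, question)]
        if not expanded:
            break
        expanded.sort(key=lambda t: verifier(t, question), reverse=True)
        beam = expanded[:beam_width]
        top_score = verifier(beam[0], question)
        if top_score <= best_score:
            break  # no candidate improves on the best sub-table so far
        best, best_score = beam[0], top_score
    return best


table = [
    {"city": "Paris", "country": "France", "population": "2.1M"},
    {"city": "Berlin", "country": "Germany", "population": "3.6M"},
]
sub_table = parallel_prune(
    table, "What is the population of Berlin?", toy_pruner, toy_verifier
)
print(sub_table)
```

In this toy setting the verifier's penalty for losing question-mentioned content plays the role the abstract assigns to detecting the loss of answer-critical data: trajectories that drop the `city` or `population` column score poorly and are discarded, while the search keeps shrinking the table along the trajectories that preserve them.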