Table Question-Answering involves both understanding the natural language query and grounding it in the context of the input table to extract the relevant information. In this context, many methods have highlighted the benefits of intermediate pre-training on SQL queries. However, while most approaches aim to generate final answers directly from the inputs, we argue that SQL queries can be put to better use during training. By learning to imitate a restricted subset of SQL-like algebraic operations, we show that their execution flow provides intermediate supervision steps that improve generalization and structural reasoning compared with classical approaches in the field. Our study bridges the gap between semantic parsing and direct answering methods, and provides useful insights into which types of operations should be predicted by a generative architecture and which are better executed by an external algorithm.
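To make the notion of an execution flow concrete, the following is a minimal illustrative sketch, not the method described in this work. It assumes a toy table and three hypothetical operations (`select_rows`, `project`, `aggregate_sum`) standing in for a restricted subset of SQL-like algebra. The helper `execution_trace` is likewise an assumption: it records the table produced after each operation, which is the kind of intermediate result that could serve as a supervision target.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
Table = List[Row]
Operation = Callable[[Table], Table]


def select_rows(table: Table, predicate: Callable[[Row], bool]) -> Table:
    """Keep only the rows that satisfy the predicate (SQL WHERE)."""
    return [row for row in table if predicate(row)]


def project(table: Table, columns: List[str]) -> Table:
    """Keep only the named columns (SQL SELECT col1, col2, ...)."""
    return [{c: row[c] for c in columns} for row in table]


def aggregate_sum(table: Table, column: str) -> Table:
    """Reduce the table to a single row holding the column sum (SQL SUM)."""
    return [{f"sum_{column}": sum(row[column] for row in table)}]


def execution_trace(table: Table, steps: List[Tuple[str, Operation]]) -> List[Tuple[str, Table]]:
    """Apply each operation in order and record every intermediate table."""
    trace = []
    current = table
    for name, op in steps:
        current = op(current)
        trace.append((name, current))
    return trace


# Toy input table (hypothetical data, for illustration only).
table = [
    {"city": "Paris", "country": "France", "population": 2_100_000},
    {"city": "Lyon", "country": "France", "population": 500_000},
    {"city": "Berlin", "country": "Germany", "population": 3_600_000},
]

# Roughly: SELECT SUM(population) FROM table WHERE country = 'France'
steps = [
    ("filter country = France",
     lambda t: select_rows(t, lambda r: r["country"] == "France")),
    ("project population", lambda t: project(t, ["population"])),
    ("sum population", lambda t: aggregate_sum(t, "population")),
]

trace = execution_trace(table, steps)
for name, intermediate in trace:
    print(name, "->", intermediate)
```

In this sketch, only the last entry of `trace` corresponds to the final answer. The earlier entries are the intermediate tables that a direct-answering model never sees, and that a model trained to imitate the operations could be supervised on.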