Table summarization condenses information from tabular data into concise, comprehensible text. Existing approaches, however, often fail to meet users' information and quality requirements and tend to overlook the complexity of real-world queries. In this paper, we address these limitations by introducing query-focused multi-table summarization. Our approach comprises a table serialization module, a summarization controller, and a large language model (LLM); it takes a textual query and multiple tables as input and generates a query-dependent summary tailored to the user's information needs. To facilitate research in this area, we present a dataset built specifically for this task, consisting of 4909 query-summary pairs, each associated with multiple tables. Through extensive experiments on this dataset, we demonstrate the effectiveness of our method compared to baseline approaches. Our findings offer insights into the challenges of complex table reasoning for precise summarization and contribute to the advancement of research in query-focused multi-table summarization.
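As a rough illustration of the input side of such a pipeline, the sketch below serializes several tables into plain text and combines them with a query into a single prompt. It is a minimal sketch under stated assumptions, not the method of the paper: the `Table` class, the pipe-delimited serialization, the prompt wording, and the function names `serialize_table` and `build_prompt` are all hypothetical, and the summarization controller and the LLM call are deliberately left out because the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass
class Table:
    """Hypothetical in-memory table: a title, column names, and rows."""
    title: str
    columns: list[str]
    rows: list[list[str]]


def serialize_table(table: Table) -> str:
    """Flatten one table into pipe-delimited text (one possible serialization)."""
    lines = [f"Table: {table.title}", " | ".join(table.columns)]
    lines += [" | ".join(str(cell) for cell in row) for row in table.rows]
    return "\n".join(lines)


def build_prompt(query: str, tables: list[Table]) -> str:
    """Combine the query and all serialized tables into a single prompt string."""
    serialized = "\n\n".join(serialize_table(t) for t in tables)
    return (
        "Summarize the tables below so that the summary answers the query.\n\n"
        f"Query: {query}\n\n{serialized}\n\nSummary:"
    )
```

The resulting string would then be passed to whichever LLM is used. Keeping serialization separate from prompt construction makes it easy to try other serialization formats (for example JSON or Markdown) without touching the rest of the pipeline.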