Query-focused summarization (QFS) is a challenging natural language processing task in which a summary is generated to address a specific query. The broader field of Generative Information Retrieval (Gen-IR) aims to revolutionize information extraction from vast document corpora through generative approaches, encompassing Generative Document Retrieval (GDR) and Grounded Answer Generation (GAR). This paper highlights the role of QFS in GAR, a key subdomain of Gen-IR that produces human-readable answers in direct correspondence with queries, grounded in relevant documents. We propose QontSum, a novel approach to QFS that leverages contrastive learning to help the model attend to the most relevant regions of the input document. We evaluate our approach on two benchmark datasets for QFS and show that it either outperforms the existing state of the art (SOTA) or achieves comparable performance at considerably lower computational cost. These gains come from enhancements in the fine-tuning stage rather than from large-scale pre-training experiments, which are the focus of current SOTA methods. Moreover, we conduct a human study and find that the generated summaries are more relevant to the posed queries, without compromising fluency. We further conduct an error analysis to understand our model's limitations and propose avenues for future research.