Table-and-text hybrid question answering (HybridQA) is a widely used and challenging NLP task, commonly applied in the financial and scientific domains. Early research focused on migrating methods from other QA tasks to HybridQA, while more recent work has proposed a growing number of HybridQA-specific methods. Despite the rapid development of HybridQA, the field still lacks a systematic survey that summarizes the main techniques and helps advance further research. We therefore present this work to summarize the current HybridQA benchmarks and methods, and then analyze the challenges and future directions of this task. The contributions of this paper are threefold: (1) to the best of our knowledge, the first survey covering benchmarks, methods, and challenges for HybridQA; (2) a systematic investigation with a reasonable comparison of existing systems to articulate their advantages and shortcomings; (3) a detailed analysis of challenges along four important dimensions to shed light on future directions.