Fact-centric question answering (QA) often requires access to multiple, heterogeneous information sources. By jointly considering several sources, such as a knowledge base (KB), a text collection, and tables from the web, QA systems can improve their answer coverage and confidence. However, existing QA benchmarks are mostly constructed with a single source of knowledge in mind. This limits the ability of these benchmarks to fairly evaluate QA systems that can tap into more than one information repository. To bridge this gap, we release CompMix, a crowdsourced QA benchmark that naturally demands the integration of a mixture of input sources. CompMix has a total of 9,410 questions and features several complex intents, such as joins and temporal conditions. Evaluation of a range of QA systems on CompMix highlights the need for further research on leveraging information from heterogeneous sources.