Question answering over tables and databases in natural language has become a key capability for democratizing insights from tabular data sources. Such systems must first retrieve the data relevant to a given natural language query, and several methods have been introduced for this step. In this work we present and study DCTR, a table retrieval mechanism that combines fine-grained typed query decomposition with global connectivity-awareness, to handle the challenges of open-domain question answering over relational databases in complex usage contexts. We evaluate the effectiveness of the two mechanisms through the lens of retrieval complexity, which we measure along two axes: query complexity and data complexity. Our analyses over industry-aligned benchmarks illustrate the robustness of DCTR for highly composite queries and densely connected databases.