Relational databases play an important role in the Big Data era. However, it is challenging for non-experts to fully unleash the analytical power of relational databases, since they are not familiar with database languages such as SQL. Many techniques have been proposed to automatically generate SQL from natural language, but they suffer from two issues: (1) they still make many mistakes, particularly for complex queries, and (2) they do not provide a flexible way for non-expert users to validate and refine incorrect queries. To address these issues, we introduce a new interaction mechanism that allows users to directly edit a step-by-step explanation of an incorrect SQL query in order to fix its errors. Experiments on the Spider benchmark show that our approach outperforms three state-of-the-art approaches by at least 31.6% in terms of execution accuracy. A user study with 24 participants further shows that our approach helped users solve significantly more SQL tasks in less time and with higher confidence, demonstrating its potential to expand access to databases, particularly for non-experts.