Every SQL statement is limited to returning a single, possibly denormalized, table. This design decision has far-reaching consequences: (1) for database users, in terms of slow query performance, long query result transfer times, and usability issues of SQL in web applications and object-relational mappers; and (2) for database architects, who must design query optimizers around logical (algebraic) join enumeration effort, memory consumption for intermediate result materialization, and physical operator selection effort. In effect, the entire query optimization stack is shaped by that design decision. In this paper, we argue that the single-table limitation should be dropped. We extend the SELECT clause of SQL with a keyword 'RESULTDB' to support returning a result database. Our approach has clear semantics: our extended SQL returns subsets of all tables, containing only those tuples that would be part of the traditional (single-table) query result set, but without performing any denormalization through joins. Our SQL extension is backward compatible. Moreover, we discuss the surprisingly long list of benefits of our approach. First, for database users: far simpler and more readable application code, better query performance, smaller query results, and shorter query result transfer times. Second, for database architects: we show how to leverage existing closed-source systems as well as how to change open-source database systems to support our feature, and we propose several algorithms for integrating it into both kinds of systems. We present an initial experimental study with promising results.
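To make the stated semantics concrete, the following minimal sketch emulates a result database on top of an unmodified SQLite instance. It is an illustration only, not one of the algorithms proposed in the paper: SQLite does not understand 'RESULTDB', the schema (`customers`, `orders`) and the helper `result_db` are hypothetical, and the emulation still runs one join per table, which is precisely the work a native implementation would aim to avoid.

```python
import sqlite3


def result_db(conn, aliases, from_clause, where_clause):
    """Hypothetical helper: for each table alias, return the distinct rows
    of that table that contribute to the traditional single-table result."""
    result = {}
    for alias in aliases:
        sql = (
            f"SELECT DISTINCT {alias}.* FROM {from_clause} "
            f"WHERE {where_clause} ORDER BY {alias}.id"
        )
        result[alias] = conn.execute(sql).fetchall()
    return result


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1, 150), (11, 1, 200), (12, 2, 50), (13, 3, 120);
    """
)

from_clause = "customers c JOIN orders o ON c.id = o.customer_id"
where_clause = "o.amount > 100"

# Traditional query: one denormalized table; customer 'Ada' is repeated per order.
single_table = conn.execute(
    f"SELECT c.id, c.name, o.id, o.customer_id, o.amount "
    f"FROM {from_clause} WHERE {where_clause} ORDER BY o.id"
).fetchall()

# Emulated result database: one subset per base table, no repeated tuples.
db = result_db(conn, ["c", "o"], from_clause, where_clause)

print(single_table)
print(db["c"])
print(db["o"])
```

In this toy instance the single-table result carries the customer 'Ada' twice, once per qualifying order, whereas the emulated result database returns her tuple once in the `customers` subset. That is the redundancy the abstract attributes to denormalization.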