Since the inception of SQL, nulls have frustrated database users and builders alike. Those writing SQL must painstakingly guard their queries against surprising results caused by nulls, while those building database engines constantly struggle to implement the subtle semantics of 3-valued logic. Given that the relational model already provides a way to represent missing information, namely, with the absence of a tuple in a relation, one may step back and ask:"Are nulls really necessary?" We answer:"No!" by proposing a new semantics for SQL that completely eliminates nulls. Our semantics, called Columnar Semantics, is as expressive as the standard 3-valued logic semantics, and behaves the same when the data and query are null-free. Where the two semantics differ, Columnar Semantics results in simpler queries. To evaluate Columnar Semantics and any other alternative semantics or query languages, we propose MIA (Missing Information Artifacts), a collection of queries and data sets for handling missing information, and invite contributions from the community.
翻译:自SQL诞生以来,NULL值便持续困扰着数据库用户与构建者。编写SQL的用户必须谨慎防护查询免受NULL值引发的意外结果影响,而构建数据库引擎的人员则不断挣扎于实现三值逻辑的微妙语义。鉴于关系模型已提供表达缺失信息的方式(即关系中元组的缺失),我们不妨退一步追问:"NULL值真的必要吗?"我们的回答是:"不!"并提出了一种彻底消除NULL值的SQL新语义。这种名为列式语义(Columnar Semantics)的语义具有与标准三值逻辑相同的表达力,且在数据与查询不含NULL值时行为一致。当两种语义存在差异时,列式语义将产生更简洁的查询。为评估列式语义及其他替代语义或查询语言,我们提出了MIA(缺失信息人工制品)——一套用于处理缺失信息的查询与数据集集合,并诚邀社区贡献。