We address the problems of giving a semantics to, and performing query answering (QA) on, a relational database (RDB) that has missing values (MVs). The causes of the missing values are governed by a Missingness Mechanism that is modelled as a Bayesian Network, which represents a Missingness Graph (MG) and involves the DB attributes. Our approach departs considerably from the treatment of RDBs with NULL values. The MG, together with the observed DB, allows us to build a block-independent probabilistic DB, on the basis of which we propose two QA techniques that jointly capture the probabilistic uncertainty and the statistical plausibility of the implicit imputation of MVs. We obtain complexity results that characterize the computational feasibility of these approaches.
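To make the notion of a block-independent probabilistic DB concrete, the following is a minimal sketch and not the construction proposed in this work. It assumes that each tuple with a missing value expands into a block of mutually exclusive completions, that distinct blocks are independent, and that the completion probabilities come from a conditional distribution over the missing attribute. The table `observed`, the distribution `cpt`, and the helpers `build_blocks` and `prob_exists` are all hypothetical names and values chosen for illustration.

```python
from math import prod

# Hypothetical observed table: (id, dept, salary_band); None marks a missing value.
observed = [
    (1, "sales", "high"),
    (2, "sales", None),
    (3, "ops", None),
]

# Hypothetical conditional distribution P(salary_band | dept). It stands in for
# whatever distribution a missingness graph and the observed data would yield.
cpt = {
    "sales": {"high": 0.25, "low": 0.75},
    "ops": {"high": 0.5, "low": 0.5},
}


def build_blocks(rows, cpt):
    """Map each row id to a block of mutually exclusive (tuple, probability) pairs."""
    blocks = {}
    for rid, dept, band in rows:
        if band is not None:
            # Fully observed tuple: a single alternative with probability 1.
            blocks[rid] = [((rid, dept, band), 1.0)]
        else:
            # Missing value: one alternative per possible completion.
            blocks[rid] = [((rid, dept, b), p) for b, p in cpt[dept].items()]
    return blocks


def prob_exists(blocks, predicate):
    """Probability that at least one tuple satisfying `predicate` is present.

    Alternatives inside a block are mutually exclusive, so their probabilities
    add; blocks are assumed independent, so the miss probabilities multiply.
    """
    miss = prod(
        1.0 - sum(p for t, p in alts if predicate(t))
        for alts in blocks.values()
    )
    return 1.0 - miss


blocks = build_blocks(observed, cpt)
p_ops_high = prob_exists(blocks, lambda t: t[1] == "ops" and t[2] == "high")
p_low = prob_exists(blocks, lambda t: t[2] == "low")
```

Under these assumptions, a Boolean existential query reduces to a product over blocks. This is only the simplest case; it does not reflect the two QA techniques or the complexity results mentioned in the abstract.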