Conditional-independence-based discovery uses statistical tests to identify a graphical model that represents the independence structure of the variables in a dataset. These tests, however, can be unreliable, and the algorithms are sensitive to errors and violated assumptions. Often, some tests are never used in the construction of the graph. In this work, we show that these redundant tests have the potential to detect and sometimes correct errors in the learned model. We further show, however, that not all such tests carry this additional information and that redundant tests have to be applied with care. Specifically, we argue that conditional (in)dependence statements that hold for every probability distribution are unlikely to detect or correct errors, in contrast to those that follow only from graphical assumptions.
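To make the notion of a redundant test concrete, the following minimal sketch lists the conditional independence statements implied by a small learned graph and removes those assumed to have been tested during construction. The chain graph, the `used` set, and all function names are hypothetical illustrations, not taken from the paper. The sketch also does not separate statements that hold for every probability distribution from those that rely on graphical assumptions, which is the distinction drawn above.

```python
from itertools import combinations


def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents[v])
    return seen


def d_separated(parents, x, y, z):
    """Check whether x and y are d-separated given the set z.

    Uses the moralized ancestral graph criterion. Assumes x and y
    are distinct and neither is in z.
    """
    keep = ancestors(parents, {x, y} | set(z))
    adj = {v: set() for v in keep}
    for child in keep:
        ps = list(parents[child])
        for p in ps:  # undirected copy of each directed edge
            adj[p].add(child)
            adj[child].add(p)
        for p, q in combinations(ps, 2):  # connect parents of a common child
            adj[p].add(q)
            adj[q].add(p)
    # Search from x without ever entering the conditioning set.
    seen, stack = set(), [x]
    while stack:
        v = stack.pop()
        if v in seen or v in z:
            continue
        seen.add(v)
        stack.extend(adj[v])
    return y not in seen


def implied_independences(parents):
    """Enumerate all (x, y, z) with x d-separated from y given z."""
    nodes = sorted(parents)
    found = []
    for x, y in combinations(nodes, 2):
        rest = [v for v in nodes if v not in (x, y)]
        for k in range(len(rest) + 1):
            for z in combinations(rest, k):
                if d_separated(parents, x, y, set(z)):
                    found.append((x, y, z))
    return found


# Hypothetical learned graph: the chain A -> B -> C -> D (child -> parents).
chain = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"]}

# Assumption: these are the tests that removed the three non-adjacent edges.
used = {("A", "C", ("B",)), ("A", "D", ("B",)), ("B", "D", ("C",))}

implied = implied_independences(chain)
redundant = [s for s in implied if s not in used]
for x, y, z in redundant:
    print(f"{x} _||_ {y} | {{{', '.join(z)}}}")
```

Each printed statement is implied by the graph but was never tested, so running the corresponding test on data gives a check on the learned model. Enumerating all conditioning sets is exponential in the number of variables, so this approach is only practical for small graphs.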