The graphical structure of Probabilistic Graphical Models (PGMs) represents the conditional independence (CI) relations that hold in the modeled distribution. Every separator in the graph represents a CI relation in the distribution, making separators the vehicle through which new CIs are inferred and verified. The notion of separation depends on whether the graph is directed (i.e., a Bayesian Network) or undirected (i.e., a Markov Network). All current systems of inference for deriving CIs in PGMs rest on the premise that the set of CIs used to construct the PGM holds exactly. In practice, algorithms that extract the structure of PGMs from data discover approximate CIs that do not hold exactly in the distribution. In this paper, we ask how the error in this set propagates to the inferred CIs read off the graphical structure. More precisely, what guarantee can we provide on an inferred CI when the set of CIs that entails it holds only approximately? It has recently been shown that, in the general case, no such guarantee can be provided. In this work, we prove new negative and positive results concerning this problem. On the negative side, we prove that separators in undirected PGMs do not necessarily represent approximate CIs; that is, no guarantee can be provided for CIs inferred from the structure of undirected graphs. On the positive side, we prove that such a guarantee exists for the set of CIs inferred in directed graphical models, making the $d$-separation algorithm a sound and complete system for inferring approximate CIs. We also establish improved approximation guarantees for independence relations derived from marginal and saturated CIs.