A Bayesian Network is a directed acyclic graph (DAG) on a set of $n$ random variables (the vertices); a Bayesian Network Distribution (BND) is a probability distribution on the random variables that is Markovian on the graph. A finite $k$-mixture of such models is graphically represented by a larger graph which has an additional "hidden" (or "latent") random variable $U$, ranging in $\{1,\ldots,k\}$, and a directed edge from $U$ to every other vertex. Models of this type are fundamental to causal inference, where $U$ models an unobserved confounding effect of multiple populations, obscuring the causal relationships in the observable DAG. Solving the mixture problem and recovering the joint probability distribution on $U$ makes traditionally unidentifiable causal relationships identifiable. Using a reduction to the better-studied "product" case on empty graphs, we give the first algorithm to learn mixtures of non-empty DAGs.
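As a sketch of the model described above, the mixture can be written as a factorization over the larger graph. The symbols $X_1,\ldots,X_n$ for the observable variables, $\mathrm{pa}(i)$ for the parents of $X_i$ in the observable DAG, and $\pi_u = \Pr[U=u]$ for the mixing weights are notation assumed here for illustration; they do not appear in the text above.

\[
\Pr[X_1=x_1,\ldots,X_n=x_n] \;=\; \sum_{u=1}^{k} \pi_u \prod_{i=1}^{n} \Pr\bigl[X_i=x_i \,\big|\, X_{\mathrm{pa}(i)}=x_{\mathrm{pa}(i)},\; U=u\bigr].
\]

When the observable DAG is empty, every $\mathrm{pa}(i)$ is empty and each mixture component is a product distribution, which is the "product" case:

\[
\Pr[X_1=x_1,\ldots,X_n=x_n] \;=\; \sum_{u=1}^{k} \pi_u \prod_{i=1}^{n} \Pr\bigl[X_i=x_i \,\big|\, U=u\bigr].
\]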