A Bayesian Network is a directed acyclic graph (DAG) on a set of $n$ random variables (the vertices); a Bayesian Network Distribution (BND) is a probability distribution on the random variables that is Markovian on the graph. A finite $k$-mixture of such models is represented graphically by a larger graph that has an additional "hidden" (or "latent") random variable $U$, ranging in $\{1,\ldots,k\}$, and a directed edge from $U$ to every other vertex. Models of this type are fundamental to causal inference, where $U$ models an unobserved confounding effect of multiple populations, obscuring the causal relationships in the observable DAG. Solving the mixture problem and recovering the joint probability distribution of $U$ and the observed variables makes traditionally unidentifiable causal relationships identifiable. Using a reduction to the better-studied "product" case on empty graphs, we give the first algorithm to learn mixtures of non-empty DAGs.
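To make the confounding effect concrete, the following is a minimal sketch of a $k$-mixture on two binary variables, assuming $k=2$ and the "product" case (an empty observable graph, so $X_1$ and $X_2$ are independent given $U$). The function name `mixture_joint`, the table layout, and all probability values are illustrative assumptions, not taken from the paper, and the code only evaluates the forward model; it is not the learning algorithm.

```python
from itertools import product


def mixture_joint(weights, p_x1, p_x2_given_x1):
    """Observable joint P(x1, x2) of a k-mixture over two variables.

    weights[u]             = P(U = u)
    p_x1[u][a]             = P(X1 = a | U = u)
    p_x2_given_x1[u][a][b] = P(X2 = b | X1 = a, U = u)
    """
    n1 = len(p_x1[0])
    n2 = len(p_x2_given_x1[0][0])
    joint = {}
    for a, b in product(range(n1), range(n2)):
        # Sum over the hidden variable U: it is never observed directly.
        joint[(a, b)] = sum(
            w * p_x1[u][a] * p_x2_given_x1[u][a][b]
            for u, w in enumerate(weights)
        )
    return joint


# Illustrative (made-up) parameters: two equally likely populations.
weights = [0.5, 0.5]
p_x1 = [[0.9, 0.1], [0.1, 0.9]]
# Empty graph: within each population, X2 does not depend on X1.
p_x2_given_x1 = [
    [[0.9, 0.1], [0.9, 0.1]],
    [[0.1, 0.9], [0.1, 0.9]],
]

joint = mixture_joint(weights, p_x1, p_x2_given_x1)

# Marginals of the observable distribution.
m1 = [sum(joint[(a, b)] for b in range(2)) for a in range(2)]
m2 = [sum(joint[(a, b)] for a in range(2)) for b in range(2)]

# Largest gap between the joint and the product of its marginals.
# A nonzero gap means X1 and X2 look dependent once U is marginalized out.
gap = max(abs(joint[(a, b)] - m1[a] * m2[b]) for a, b in joint)
print(joint, gap)
```

Although neither component has an edge between $X_1$ and $X_2$, the observable joint does not factorize into its marginals: the hidden $U$ alone induces the apparent dependence. This is the sense in which the mixture obscures the causal structure of the observable DAG.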