Undirected probabilistic graphical models represent the conditional dependencies, or Markov properties, of a collection of random variables. Knowing the sparsity of such a graphical model is valuable for modeling multivariate distributions and for efficiently performing inference. While the problem of learning graph structure from data has been studied extensively for certain parametric families of distributions, most existing methods fail to consistently recover the graph structure for non-Gaussian data. Here we propose an algorithm for learning the Markov structure of continuous and non-Gaussian distributions. To characterize conditional independence, we introduce a score based on integrated Hessian information from the joint log-density, and we prove that this score upper bounds the conditional mutual information for a general class of distributions. To compute the score, our algorithm SING estimates the density using a deterministic coupling, induced by a triangular transport map, and iteratively exploits sparse structure in the map to reveal sparsity in the graph. For certain non-Gaussian datasets, we show that our algorithm recovers the graph structure even with a biased approximation to the density. Among other examples, we apply SING to learn the dependencies between the states of a chaotic dynamical system with local interactions.
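To make the idea of a Hessian-based conditional independence score concrete, the following is a minimal sketch, not the SING algorithm itself. It assumes the score for a pair of variables (i, j) is the expected squared mixed partial derivative of the joint log-density, a form the abstract does not spell out. It also assumes the log-density is known in closed form, whereas SING estimates it through a triangular transport map. The toy density, the function names, the finite-difference step, and the threshold are all illustrative choices.

```python
import numpy as np


def log_density(x):
    """Unnormalized log-density of a toy 3-variable non-Gaussian distribution.

    x has shape (..., 3). Only x[..., 0] and x[..., 1] interact, so the
    Markov graph has the single edge (0, 1).
    """
    x1, x2, x3 = x[..., 0], x[..., 1], x[..., 2]
    return -0.5 * x1**2 - 0.5 * (x2 - x1**2) ** 2 - 0.5 * x3**2


def sample(n, rng):
    """Draw exact samples by transforming standard normal draws.

    The transformation is lower triangular: x1 = z1, x2 = z2 + z1^2, x3 = z3.
    """
    z = rng.standard_normal((n, 3))
    x = z.copy()
    x[:, 1] = z[:, 1] + z[:, 0] ** 2
    return x


def hessian_score(log_pdf, samples, h=1e-3):
    """Monte Carlo estimate of E[(d^2 log pi / dx_i dx_j)^2] for all i != j.

    Mixed partials are approximated with central finite differences.
    Assumed form of the score; the diagonal is left at zero.
    """
    _, d = samples.shape
    score = np.zeros((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            ei = np.zeros(d)
            ej = np.zeros(d)
            ei[i] = h
            ej[j] = h
            mixed = (
                log_pdf(samples + ei + ej)
                - log_pdf(samples + ei - ej)
                - log_pdf(samples - ei + ej)
                + log_pdf(samples - ei - ej)
            ) / (4.0 * h**2)
            score[i, j] = score[j, i] = np.mean(mixed**2)
    return score


rng = np.random.default_rng(0)
x = sample(20000, rng)
score = hessian_score(log_density, x)

# Hypothetical threshold: pairs whose score exceeds it are declared edges.
edges = score > 1e-3
print(np.round(score, 3))
print(edges)
```

For this toy density the mixed partial derivative in (x1, x2) equals 2·x1, so the (0, 1) entry of the score should be close to 4, while the entries involving x3 vanish up to floating-point error. Thresholding the score matrix then recovers the single edge of the graph.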