Graphical models are usually employed to represent statistical relationships between pairs of variables when all the remaining variables are fixed. In this picture, conditionally independent pairs are disconnected. In the real world, however, strict conditional independence is almost impossible to prove. Here we use a weaker version of the concept of graphical models, in which only the linear component of the conditional dependencies is represented. This notion enables us to relate the marginal Pearson correlation coefficient (a measure of linear marginal dependence) to the partial correlations (a measure of linear conditional dependence). Specifically, we use the graphical model to express the marginal Pearson correlation $\rho_{ij}$ between variables $X_i$ and $X_j$ as a sum of the efficacies with which messages propagate along all the paths connecting the two variables in the graph. The expansion is convergent and provides a mechanistic interpretation of how global correlations arise from local interactions. Moreover, weighing the relevance of each path and of each intermediate node offers an intuitive way to reason about interventions, revealing, for example, what happens when a given edge is pruned or its weight is modified. The expansion is also useful for constructing minimal equivalent models, in which latent variables are introduced to replace a larger number of marginalised variables. In addition, the expansion yields an alternative algorithm for calculating marginal Pearson correlations, which is particularly beneficial when inverting the partial correlation matrix is difficult. Finally, for Gaussian variables, the mutual information is also related to message-passing efficacies along paths in the graph.
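To make the link between partial and marginal correlations concrete, the following is a minimal sketch and not necessarily the expansion derived in this work. It relies on a standard identity: if $R$ is the symmetric matrix of partial correlations with zero diagonal, then the covariance is proportional, up to a diagonal rescaling, to $(I - R)^{-1} = \sum_{k \ge 0} R^k$, and the entry $(R^k)_{ij}$ sums the products of partial correlations along all walks of length $k$ from $i$ to $j$. The function names, the three-node chain, and the truncation length are illustrative assumptions, and this particular series is only valid when the spectral radius of $R$ is below one, which the code checks explicitly.

```python
import numpy as np


def marginal_from_inverse(R):
    """Marginal correlations obtained by directly inverting I - R.

    R: symmetric matrix of partial correlations with zero diagonal.
    """
    M = np.linalg.inv(np.eye(R.shape[0]) - R)
    d = np.sqrt(np.diag(M))
    return M / np.outer(d, d)  # rescale to unit diagonal


def marginal_from_walks(R, max_len=200):
    """Same quantity from the truncated series sum_{k=0}^{max_len} R^k.

    (R^k)[i, j] accumulates the products of partial correlations over
    all walks of length k between nodes i and j.
    Assumption: spectral radius of R < 1, otherwise the series diverges.
    """
    if np.max(np.abs(np.linalg.eigvalsh(R))) >= 1.0:
        raise ValueError("walk series does not converge for this R")
    n = R.shape[0]
    M = np.eye(n)      # k = 0 term
    term = np.eye(n)
    for _ in range(max_len):
        term = term @ R  # advance to walks that are one step longer
        M += term
    d = np.sqrt(np.diag(M))
    return M / np.outer(d, d)


# Hypothetical chain X1 - X2 - X3: no direct edge between X1 and X3.
R = np.array([[0.0, 0.5, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.5, 0.0]])

exact = marginal_from_inverse(R)
approx = marginal_from_walks(R)
print(np.round(exact, 4))
```

In this chain the partial correlation between the end nodes is zero, yet their marginal correlation is not: it is carried entirely by walks through the middle node, and it equals the product of the two neighbouring marginal correlations, as expected for a Markov chain. Setting an entry of `R` to zero and recomputing is one simple way to explore the effect of pruning an edge.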