Sparse graph recovery methods work well when the data follow their assumptions, but they are often not designed for downstream probabilistic queries. This limits their use to identifying connections among the input variables. Probabilistic Graphical Models (PGMs), on the other hand, assume an underlying base graph between variables and learn a distribution over them. PGM design choices are made carefully so that the inference and sampling algorithms are efficient, which brings in certain restrictions and, often, simplifying assumptions. In this work, we propose Neural Graph Revealers (NGRs), an attempt to efficiently merge sparse graph recovery methods with PGMs into a single flow. The problem setting consists of input data X with D features and M samples; the task is to recover a sparse graph showing the connections between the features and, at the same time, learn a probability distribution over the D features. NGRs view neural networks as a 'glass box' or, more specifically, as a multitask learning framework. We introduce the 'graph-constrained path norm', which NGRs leverage to learn a graphical model that captures complex non-linear functional dependencies between the features in the form of an undirected sparse graph. Furthermore, NGRs can handle multimodal inputs such as images, text, categorical data, and embeddings, which are not straightforward to incorporate in existing methods. We show experimental results for sparse graph recovery and probabilistic inference on data from Gaussian graphical models and on a multimodal infant mortality dataset from the Centers for Disease Control and Prevention.
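The abstract names the graph-constrained path norm without defining it. As a rough illustration only, the sketch below shows one possible reading, assuming a multilayer perceptron that maps the D features back to the D features, with weight matrices stored as (outputs, inputs). The product of the absolute weight matrices is then zero at entry (o, i) exactly when no path connects input i to output o. The function names, the penalty form, and the `sparsity` and `tol` parameters are hypothetical and are not taken from the paper.

```python
import numpy as np

def path_dependency_matrix(weights):
    """Product of absolute weight matrices, shape (D, D).

    Assumes each W in `weights` has shape (n_out, n_in), listed from the
    input layer to the output layer. Entry (o, i) is zero exactly when
    no path with all-nonzero weights links input i to output o.
    """
    S = np.abs(weights[0])
    for W in weights[1:]:
        S = np.abs(W) @ S
    return S

def graph_constrained_penalty(weights, sparsity=0.1):
    """Hypothetical regularizer: discourage self-dependence (diagonal)
    and encourage sparsity among the remaining dependencies."""
    S = path_dependency_matrix(weights)
    diagonal = np.trace(S)            # S is non-negative, so this is an L1 norm
    off_diagonal = S.sum() - diagonal
    return diagonal + sparsity * off_diagonal

def recover_adjacency(weights, tol=1e-8):
    """Undirected graph: features i and j are connected if either one
    reaches the other through the network."""
    S = path_dependency_matrix(weights)
    A = (S + S.T) > tol
    np.fill_diagonal(A, False)
    return A

# Toy network with D = 3 features and 2 hidden units (hand-picked weights).
W1 = np.array([[1.0, 0.0, 0.0],    # hidden unit 0 reads feature 0
               [0.0, 2.0, 0.0]])   # hidden unit 1 reads feature 1
W2 = np.array([[0.0, -1.0],        # output 0 depends on hidden unit 1
               [3.0, 0.0],         # output 1 depends on hidden unit 0
               [0.0, 0.0]])        # output 2 depends on nothing
adjacency = recover_adjacency([W1, W2])
```

In this toy network the only recovered edge is between features 0 and 1, and feature 2 is isolated. In a training loop under these assumptions, the penalty would be added to the regression loss so that fitting the data and sparsifying the graph happen jointly.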