Sparse graph recovery methods work well when the data follow their assumptions, but they are often not designed for downstream probabilistic queries. This limits their use to identifying connections among the input variables. Probabilistic Graphical Models (PGMs), on the other hand, assume an underlying base graph between variables and learn a distribution over them. PGM design choices are made carefully so that inference and sampling algorithms are efficient, which introduces certain restrictions and often simplifying assumptions. In this work, we propose Neural Graph Revealers (NGRs), an attempt to efficiently merge sparse graph recovery methods with PGMs into a single flow. The problem setting consists of input data X with D features and M samples, and the task is to recover a sparse graph showing the connections between the features. NGRs view neural networks as a "white box" or, more specifically, as a multitask learning framework. We introduce the "graph-constrained path norm", which NGRs leverage to learn a graphical model that captures complex non-linear functional dependencies between the features in the form of an undirected sparse graph. Furthermore, NGRs can handle multimodal inputs such as images, text, categorical data, and embeddings, which are not straightforward to incorporate into existing methods. We show experimental results for sparse graph recovery and probabilistic inference on data from Gaussian graphical models and on a multimodal infant mortality dataset from the CDC.
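The "white box" reading of a network can be illustrated with a small sketch. It assumes a plain multilayer perceptron that maps the D features back to the D features, with layer weights W_1, ..., W_L applied as `h @ W`. Under that assumption, the product of the entrywise absolute weight matrices is a D x D matrix whose (i, j) entry is zero exactly when no path of nonzero weights leads from input feature i to output feature j. The function names, the thresholding step, and the max-based symmetrization below are illustrative choices, not the authors' implementation.

```python
import numpy as np


def path_norm_matrix(weights):
    """Multiply the entrywise absolute values of the layer weights.

    weights: list [W_1, ..., W_L]; W_1 has D rows and W_L has D columns.
    Returns a D x D non-negative matrix of input-to-output path strengths.
    """
    P = np.abs(weights[0])
    for W in weights[1:]:
        P = P @ np.abs(W)
    return P


def recover_graph(weights, threshold=1e-6):
    """Read an undirected adjacency matrix off the path-strength matrix.

    Hypothetical post-processing: drop self-dependence, symmetrize with
    an elementwise max, and keep entries above a small threshold.
    """
    P = path_norm_matrix(weights)
    np.fill_diagonal(P, 0.0)      # a feature should not predict itself
    S = np.maximum(P, P.T)        # undirected: keep the stronger direction
    return S > threshold


# Toy network with D = 3 features and 2 hidden units.
# Hidden unit 0 reads feature 0 and writes to feature 1;
# hidden unit 1 reads feature 2 but has no outgoing weights.
W1 = np.array([[1.0, 0.0],
               [0.0, 0.0],
               [0.0, -2.0]])
W2 = np.array([[0.0, 0.5, 0.0],
               [0.0, 0.0, 0.0]])

adj = recover_graph([W1, W2])
print(adj.astype(int))
```

In this toy case the only surviving path runs from feature 0 to feature 1, so the recovered graph has the single edge 0-1. In a trained model, the sparsity of this matrix would come from a penalty applied during fitting rather than from hand-set weights.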