Graph Sphere: From Nodes to Supernodes in Graphical Models

High-dimensional data analysis typically focuses on low-dimensional structure, often to aid interpretation and computational efficiency. Graphical models provide a powerful methodology for learning the conditional independence structure in multivariate data by representing variables as nodes and dependencies as edges. Inference is often focused on individual edges in the latent graph. Nonetheless, there is increasing interest in determining more complex structures, such as communities of nodes, for multiple reasons, including more effective information retrieval and better interpretability. In this work, we propose a multilayer graphical model where we first cluster nodes and then, at the second layer, investigate the relationships among groups of nodes. Specifically, nodes are partitioned into "supernodes" with a data-coherent size-biased tessellation prior, which combines ideas from Bayesian nonparametrics and Voronoi tessellations. This construct also accounts for dependence among nodes within supernodes. At the second layer, the dependence structure among supernodes is modelled through a Gaussian graphical model, where the focus of inference is on "superedges". We provide theoretical justification for our modelling choices. We design tailored Markov chain Monte Carlo schemes, which also enable parallel computations. We demonstrate the effectiveness of our approach for large-scale structure learning in simulations and a transcriptomics application.
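As background for the second layer, the following minimal sketch illustrates the standard property that a Gaussian graphical model relies on: a zero off-diagonal entry in the precision (inverse covariance) matrix corresponds to a missing edge, i.e. conditional independence given all other variables. It is not the authors' method. The three-variable chain, the helper names `edges_from_precision` and `partial_correlations`, and the tolerance `tol` are all assumptions made for illustration, and the variables could equally be read as supernode-level summaries.

```python
import numpy as np


def edges_from_precision(precision, tol=1e-8):
    """Return pairs (i, j), i < j, whose precision entry is non-zero.

    In a Gaussian graphical model these pairs are the edges of the graph.
    `tol` is an illustrative numerical threshold, not a statistical test.
    """
    p = precision.shape[0]
    return [
        (i, j)
        for i in range(p)
        for j in range(i + 1, p)
        if abs(precision[i, j]) > tol
    ]


def partial_correlations(precision):
    """Partial correlations: rho_ij = -K_ij / sqrt(K_ii * K_jj)."""
    d = np.sqrt(np.diag(precision))
    rho = -precision / np.outer(d, d)
    np.fill_diagonal(rho, 1.0)
    return rho


# Hypothetical precision matrix for a chain 0 - 1 - 2 (positive definite).
K = np.array(
    [
        [2.0, -1.0, 0.0],
        [-1.0, 2.0, -1.0],
        [0.0, -1.0, 2.0],
    ]
)

Sigma = np.linalg.inv(K)        # covariance implied by K
edges = edges_from_precision(K)  # edges read off the sparsity pattern
rho = partial_correlations(K)

print("edges:", edges)
print("marginal covariance of (0, 2):", Sigma[0, 2])
print("partial correlation of (0, 2):", rho[0, 2])
```

In this toy example, variables 0 and 2 are marginally correlated (their covariance is non-zero) but conditionally independent given variable 1, so the graph has no edge between them. Learning which entries are zero from data is the hard part, and the sketch does not attempt it.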

