Message Passing Neural Networks (MPNNs) are a staple of graph machine learning. MPNNs iteratively update each node's representation in an input graph by aggregating messages from the node's neighbors, which entails a memory complexity on the order of the number of graph edges. This complexity can quickly become prohibitive for large graphs unless they are very sparse. In this paper, we propose a novel approach to alleviate this problem by approximating the input graph as an intersecting community graph (ICG) -- a combination of intersecting cliques. The key insight is that the number of communities required to approximate a graph does not depend on the graph size. We develop a new constructive version of the Weak Graph Regularity Lemma to efficiently construct an approximating ICG for any input graph. We then devise an efficient graph learning algorithm that operates directly on the ICG, with memory and time linear in the number of nodes (rather than edges). This offers a new and fundamentally different pipeline for learning on very large non-sparse graphs, whose applicability we demonstrate empirically on node classification tasks and spatio-temporal data processing.
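The node-linear complexity claimed above follows from the low-rank structure of an ICG: if the adjacency matrix is approximated by a weighted sum of community cliques, feature propagation factors through the (small) community dimension and the dense n-by-n adjacency is never materialized. The following is a minimal NumPy sketch, not the paper's implementation; the affiliation matrix `F`, weight vector `r`, and the names `icg_propagate`, `n`, `k` are illustrative assumptions.

```python
import numpy as np

# Hypothetical ICG representation: a graph on n nodes approximated by k
# intersecting communities. F[i, j] = 1 if node i belongs to community j,
# r[j] is the weight of community j, so the adjacency is approximated by
# A ~= F @ diag(r) @ F.T. (All names here are illustrative, not the paper's.)
rng = np.random.default_rng(0)
n, k, d = 1000, 8, 16                           # nodes, communities, feature dim
F = (rng.random((n, k)) < 0.2).astype(float)    # community memberships
r = rng.random(k)                               # community weights
X = rng.standard_normal((n, d))                 # node features


def icg_propagate(F, r, X):
    """One propagation step (F diag(r) F^T) X in O(nk) memory and time."""
    community_sums = F.T @ X           # (k, d): aggregate features per community
    weighted = r[:, None] * community_sums  # reweight each community
    return F @ weighted                # (n, d): broadcast back to member nodes


Y = icg_propagate(F, r, X)

# Sanity check against the dense computation (feasible only at small scale).
A = F @ np.diag(r) @ F.T
assert np.allclose(Y, A @ X)
```

The point of the factorization is that `community_sums` has only k rows, so the per-step cost scales with the number of nodes and communities, independent of the edge count of the approximated graph.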