Many graph representation learning (GRL) problems are dynamic, with millions of edges added or removed per second. A fundamental workload in this setting is dynamic link prediction: using a history of graph updates to predict whether a given pair of vertices will become connected. Recent schemes for link prediction in such dynamic settings employ Transformers, modeling individual graph updates as single tokens. In this work, we propose HOT, a model that enhances this line of work by harnessing higher-order (HO) graph structures, specifically k-hop neighbors and more general subgraphs containing a given pair of vertices. Encoding such HO structures into the attention matrix of the underlying Transformer improves link prediction accuracy, but at the cost of increased memory pressure. To alleviate this, we resort to a recent class of schemes that impose hierarchy on the attention matrix, significantly reducing the memory footprint. The final design offers a sweet spot between high accuracy and low memory utilization. HOT outperforms other dynamic GRL schemes; for example, on the MOOC dataset it achieves 9%, 7%, and 15% higher accuracy than DyGFormer, TGN, and GraphMixer, respectively. Our design can be seamlessly extended to other dynamic GRL workloads.
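To make the notion of k-hop neighbors concrete, the following minimal sketch collects the vertices within k hops of a given vertex from a timestamped edge list, using only updates that precede the prediction time. It is an illustration, not HOT's implementation: the function name `k_hop_neighbors`, the `(u, v, t)` edge format, the `before` cutoff parameter, and the treatment of edges as undirected are all assumptions introduced here.

```python
from collections import defaultdict, deque


def k_hop_neighbors(edges, source, k, before):
    """Return {vertex: hop distance} for vertices within k hops of `source`.

    Hypothetical helper for illustration only. `edges` is an iterable of
    (u, v, timestamp) tuples; only edges with timestamp strictly earlier
    than `before` are used, and edges are assumed to be undirected.
    """
    # Build an adjacency map from the graph updates seen so far.
    adj = defaultdict(set)
    for u, v, t in edges:
        if t < before:
            adj[u].add(v)
            adj[v].add(u)

    # Breadth-first search, stopping expansion at depth k.
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == k:
            continue
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)

    # The source itself is not one of its own neighbors.
    del dist[source]
    return dist


# Example history: a path 0 - 1 - 2 - 3 - 4 built up over time.
history = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 3.0), (3, 4, 4.0)]
two_hop = k_hop_neighbors(history, source=0, k=2, before=10.0)
```

Restricting the search to edges earlier than the cutoff keeps the neighborhood consistent with the dynamic setting, where a prediction may only use updates that have already occurred.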