Graph transformers have recently received significant attention in graph learning, partly due to their ability to capture global interactions via self-attention. Nevertheless, while higher-order graph neural networks have been reasonably well studied, the exploration of extending graph transformers to higher-order variants has only just begun. Both theoretical understanding and empirical results are limited. In this paper, we provide a systematic study of the theoretical expressive power of order-$k$ graph transformers and their sparse variants. We first show that an order-$k$ graph transformer without additional structural information is less expressive than the $k$-Weisfeiler-Lehman ($k$-WL) test, despite its high computational cost. We then explore strategies that both sparsify and enhance higher-order graph transformers, aiming to improve their efficiency and expressiveness. Indeed, sparsification based on neighborhood information can enhance expressive power, as it injects additional information about the input graph structure. In particular, we show that a natural neighborhood-based sparse order-$k$ transformer model is not only computationally efficient but also as expressive as the $k$-WL test. We further study several other computationally efficient sparse graph attention models and analyze their expressiveness. Finally, we provide experimental results demonstrating the effectiveness of the different sparsification strategies.
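To make the notion of neighborhood-based sparse attention over $k$-tuples concrete, the following is a minimal, illustrative sketch (not the paper's exact model): each $k$-tuple attends only to the tuples obtained by replacing one coordinate with a graph neighbor of that coordinate, i.e. the local neighborhood used in $k$-WL-style refinement. Names such as `kwl_neighbors` and `SparseTupleAttention` are hypothetical, and the layer is written for clarity rather than efficiency.

```python
# Illustrative sketch of neighborhood-sparse attention over k-tuples.
# Assumption: each k-tuple attends only to tuples differing in one
# coordinate by a graph neighbor (a k-WL-style local neighborhood).
import itertools
import torch
import torch.nn.functional as F


def kwl_neighbors(edges, n, k=2):
    """Enumerate all k-tuples of nodes and, for each, the indices of its
    local neighbors (replace one coordinate with an adjacent node)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    tuples = list(itertools.product(range(n), repeat=k))
    index = {t: i for i, t in enumerate(tuples)}
    nbrs = [[] for _ in tuples]
    for t in tuples:
        for pos in range(k):
            for w in adj[t[pos]]:
                s = t[:pos] + (w,) + t[pos + 1:]
                nbrs[index[t]].append(index[s])
    return tuples, nbrs


class SparseTupleAttention(torch.nn.Module):
    """One attention layer in which each k-tuple attends only to its
    local (k-WL-style) neighbors instead of all O(n^k) tuples."""

    def __init__(self, dim):
        super().__init__()
        self.q = torch.nn.Linear(dim, dim)
        self.k = torch.nn.Linear(dim, dim)
        self.v = torch.nn.Linear(dim, dim)

    def forward(self, x, nbrs):
        # x: (num_tuples, dim) feature vector per k-tuple
        q, k, v = self.q(x), self.k(x), self.v(x)
        out = torch.zeros_like(x)
        for i, nb in enumerate(nbrs):
            if not nb:  # isolated tuple: pass features through
                out[i] = x[i]
                continue
            scores = (k[nb] @ q[i]) / x.shape[-1] ** 0.5  # (|nb|,)
            out[i] = F.softmax(scores, dim=0) @ v[nb]
        return out


# Usage on a 4-node path graph with k = 2:
tuples, nbrs = kwl_neighbors([(0, 1), (1, 2), (2, 3)], n=4, k=2)
layer = SparseTupleAttention(dim=8)
h = layer(torch.randn(len(tuples), 8), nbrs)
```

Dense order-$k$ attention would score all pairs of tuples ($O(n^{2k})$ interactions); restricting attention to these local neighborhoods reduces the cost while, as the paper argues, supplying structural information that dense attention alone lacks.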