Unifying Gradient Regularization for Heterogeneous Graph Neural Networks

Heterogeneous Graph Neural Networks (HGNNs) are a class of powerful deep learning methods widely used to learn representations of heterogeneous graphs. Despite their rapid development, HGNNs still face challenges such as over-smoothing and non-robustness. Previous studies have shown that gradient regularization methods can mitigate these problems. However, existing gradient regularization methods focus on either graph topology or node features. There is no universal approach that integrates the two, which severely limits the efficiency of regularization. In addition, incorporating gradient regularization into HGNNs can itself cause problems, such as an unstable training process, increased complexity, and insufficient coverage of regularized information. Furthermore, a complete theoretical analysis of the effects of gradient regularization on HGNNs is still lacking. In this paper, we propose a novel gradient regularization method called Grug, which iteratively applies regularization to the gradients generated by both the propagated messages and the node features during the message-passing process. Grug provides a unified framework integrating graph topology and node features, based on which we conduct a detailed theoretical analysis of its effectiveness. Specifically, the theoretical analysis elaborates the advantages of Grug: 1) decreasing sample variance during the training process (Stability); 2) enhancing the generalization of the model (Universality); 3) reducing the complexity of the model (Simplicity); 4) improving the integrity and diversity of graph information utilization (Diversity). As a result, Grug has the potential to surpass the theoretical upper bounds set by DropMessage (AAAI-23 Distinguished Paper). In addition, we evaluate Grug on five public real-world datasets with two downstream tasks...
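To make the idea of regularizing gradients with respect to both propagated messages and node features more concrete, the following is a minimal illustrative sketch, not the Grug algorithm itself, whose details the abstract does not give. It assumes a toy single-layer, single-node-type linear model (messages M = A X, predictions Z = M W, squared-error loss) and adds a squared-norm penalty on the loss gradients with respect to M and X. The function name `grad_penalized_loss`, the penalty form, and the weight `lam` are all assumptions introduced for illustration. Plain Python is used so the sketch runs without dependencies.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def sq_norm(a):
    """Squared Frobenius norm of a matrix."""
    return sum(v * v for row in a for v in row)


def grad_penalized_loss(adj, feats, weight, targets, lam=0.1):
    """Task loss plus a gradient-norm penalty (hypothetical formulation).

    adj     : n x n normalized adjacency matrix A (encodes topology)
    feats   : n x d node feature matrix X
    weight  : d x k layer weight W
    targets : n x k regression targets Y
    lam     : penalty strength (assumed hyperparameter)
    """
    n = len(feats)
    messages = matmul(adj, feats)        # M = A X, the propagated messages
    preds = matmul(messages, weight)     # Z = M W
    diff = [[z - y for z, y in zip(zr, yr)] for zr, yr in zip(preds, targets)]
    task = sq_norm(diff) / (2 * n)       # L = ||Z - Y||_F^2 / (2n)

    # Analytic gradients of the task loss for this linear model.
    grad_z = [[d / n for d in row] for row in diff]    # dL/dZ = (Z - Y) / n
    grad_m = matmul(grad_z, transpose(weight))         # dL/dM = dL/dZ W^T
    grad_x = matmul(transpose(adj), grad_m)            # dL/dX = A^T dL/dM

    # Penalize gradients w.r.t. both messages and node features.
    penalty = lam * (sq_norm(grad_m) + sq_norm(grad_x))
    return task + penalty, task, grad_m, grad_x


# Toy 3-node path graph with self-loops and row-normalized adjacency.
adj = [[0.5, 0.5, 0.0], [1 / 3, 1 / 3, 1 / 3], [0.0, 0.5, 0.5]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[1.0], [-1.0]]
targets = [[0.0], [1.0], [0.0]]

total, task, grad_m, grad_x = grad_penalized_loss(adj, feats, weight, targets)
print("task loss:", task, "penalized loss:", total)
```

In this toy setting the gradient with respect to the messages depends on the model weights, while the gradient with respect to the features is additionally routed through the adjacency matrix, so a penalty on both touches topology and features together. A heterogeneous model would use one such term per node and edge type; that extension is not shown here.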
