Training on large-scale graphs has achieved remarkable results in graph representation learning, but its computation and storage costs have raised growing concerns. As one of the most promising directions, graph condensation addresses these issues via gradient matching, aiming to condense the full graph into a more concise yet information-rich synthetic set. Though encouraging, these strategies primarily match the directions of gradients while overlooking their magnitudes, which leads to deviations in the training trajectories. Such deviations are further magnified by the differences between the condensation and evaluation phases, culminating in accumulated errors that detrimentally affect the performance of the condensed graphs. In light of this, we propose a novel graph condensation method named \textbf{C}raf\textbf{T}ing \textbf{R}ationa\textbf{L} trajectory (\textbf{CTRL}), which offers an optimized starting point closer to the feature distribution of the original dataset and a more refined strategy for gradient matching. Theoretically, CTRL can effectively neutralize the impact of accumulated errors on the performance of condensed graphs. Extensive experiments on various graph datasets and downstream tasks support the effectiveness of CTRL. Code is released at https://github.com/NUS-HPC-AI-Lab/CTRL.
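To make the matching criterion concrete, a natural way to account for both the direction and the magnitude of gradients, rather than direction alone, is to combine a cosine-distance term with a Euclidean-distance term. The following is a minimal sketch under stated assumptions: the weighting coefficient $\beta$ and the symbols $\nabla_{\mathcal{S}}$, $\nabla_{\mathcal{T}}$ for gradients computed on the synthetic and original graphs are illustrative notation, not definitions taken verbatim from the paper:
\[
D(\nabla_{\mathcal{S}}, \nabla_{\mathcal{T}}) \;=\; \beta \left( 1 - \frac{\langle \nabla_{\mathcal{S}}, \nabla_{\mathcal{T}} \rangle}{\lVert \nabla_{\mathcal{S}} \rVert \, \lVert \nabla_{\mathcal{T}} \rVert} \right) \;+\; (1 - \beta)\, \lVert \nabla_{\mathcal{S}} - \nabla_{\mathcal{T}} \rVert_2 ,
\]
with $\beta \in [0, 1]$ trading off the two terms: $\beta = 1$ recovers pure direction matching, the failure mode described above, while smaller $\beta$ also penalizes magnitude mismatch and thus keeps the synthetic-graph trajectory closer to that of the full graph.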