Continuous-Time Dynamic Graphs (CTDGs) precisely model evolving real-world relationships and have drawn heightened interest in dynamic graph learning across academia and industry. However, existing CTDG models are challenged by noise and limited historical data. Graph Data Augmentation (GDA) has emerged as a critical solution, yet current approaches focus primarily on static graphs and struggle to handle the dynamics inherent in CTDGs. Moreover, these methods often demand substantial domain expertise for parameter tuning and lack theoretical guarantees of augmentation efficacy. To address these issues, we propose Conda, a novel latent diffusion-based GDA method tailored for CTDGs. Conda features a sandwich-like architecture that incorporates a Variational Auto-Encoder (VAE) and a conditional diffusion model to generate enhanced historical neighbor embeddings for target nodes. Unlike conventional diffusion models, which are pre-trained on entire graphs, Conda requires only the historical neighbor sequence embeddings of target nodes for training, enabling more targeted augmentation. We integrate Conda into the CTDG model and adopt an alternating training strategy to optimize performance. Extensive experiments on six widely used real-world datasets show that our approach consistently improves performance, particularly in scenarios with limited historical data.
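To make the latent-diffusion idea concrete, the following is a minimal sketch of its first half only: mapping historical neighbor embeddings into a latent space and applying the standard closed-form forward noising step. It is not the authors' implementation. The linear `encode` function is a toy stand-in for a VAE encoder, the linear noise schedule is the common DDPM default rather than anything stated in the abstract, and all names, dimensions, and weights are hypothetical. The conditional denoiser and the VAE decoder, which would complete the sandwich-like pipeline, are omitted.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed, commonly used schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def encode(embedding, weights):
    """Toy stand-in for a VAE encoder: one linear map to a latent vector."""
    return [sum(w * x for w, x in zip(row, embedding)) for row in weights]


def diffuse(latent, alpha_bar_t, rng):
    """Closed-form forward step: z_t = sqrt(a) * z_0 + sqrt(1 - a) * eps."""
    scale = math.sqrt(alpha_bar_t)
    sigma = math.sqrt(1.0 - alpha_bar_t)
    return [scale * z + sigma * rng.gauss(0.0, 1.0) for z in latent]


rng = random.Random(0)

# Hypothetical sequence of 3 historical neighbor embeddings, dimension 4.
neighbor_embeddings = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(3)]

# Hypothetical 2x4 encoder weights, giving a latent dimension of 2.
enc_weights = [[rng.gauss(0.0, 0.5) for _ in range(4)] for _ in range(2)]

alpha_bars = cumulative_alphas(linear_beta_schedule(100))
latents = [encode(e, enc_weights) for e in neighbor_embeddings]

# Noise each latent at step index 20; a denoiser would be trained to undo this.
noisy_latents = [diffuse(z, alpha_bars[20], rng) for z in latents]
```

Noising in the latent space rather than on the raw embeddings keeps the diffusion process low-dimensional, which is the usual motivation for pairing a VAE with a diffusion model.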