Graphs are central to representing interrelated data and support predictive modeling by capturing complex relationships. High-quality graph representations are important for identifying linked patterns, which has driven improvements in Graph Neural Networks (GNNs) aimed at better capturing data structure. However, data scarcity, high collection costs, and ethical concerns limit progress. As a result, generative models and data augmentation have become increasingly popular. This study explores the use of generated graphs for data augmentation: it compares the performance of combining generated graphs with real graphs, and examines how the number of generated graphs affects graph classification tasks. The experiments show that balancing scalability and quality requires choosing different generators depending on graph size. Our results introduce a new approach to graph data augmentation that keeps labels consistent and improves classification performance.