Graph generation poses a significant challenge as it involves predicting a complete graph with multiple nodes and edges from only a given label. The task is also fundamentally important to numerous real-world applications, including de-novo drug and molecular design. In recent years, several successful methods have emerged in the field of graph generation. However, these approaches suffer from two significant shortcomings: (1) the underlying Graph Neural Network (GNN) architectures used in these methods are often underexplored; and (2) these methods are often evaluated on only a limited number of metrics. To address these gaps, we investigate the expressiveness of GNNs in the context of molecular graph generation by replacing the underlying GNNs of graph generative models with more expressive GNNs. Specifically, we analyse the performance of six GNNs on six different molecular generative objectives on the ZINC-250k dataset in two different generative frameworks: autoregressive generation models, such as GCPN and GraphAF, and one-shot generation models, such as GraphEBM. Through extensive experiments, we demonstrate that advanced GNNs can indeed improve the performance of GCPN, GraphAF, and GraphEBM on molecular generation tasks, but that GNN expressiveness is not a necessary condition for a good GNN-based generative model. Moreover, we show that GCPN and GraphAF with advanced GNNs can achieve state-of-the-art results when compared against 17 other non-GNN-based graph generative approaches, such as variational autoencoders and Bayesian optimisation models, on the proposed molecular generative objectives (DRD2, Median1, Median2), which are important metrics for de-novo molecular design.