Graph generation has been dominated by autoregressive models owing to their simplicity and effectiveness, despite their sensitivity to ordering. Diffusion models have nevertheless attracted growing attention, as they offer comparable performance while being permutation-invariant. Current graph diffusion models generate graphs in a one-shot fashion, but they require extra features and thousands of denoising steps to achieve optimal performance. We introduce PARD, a Permutation-invariant Auto Regressive Diffusion model that integrates diffusion models with autoregressive methods. PARD retains the effectiveness and efficiency of autoregressive models while remaining permutation-invariant, and is therefore free of ordering sensitivity. Specifically, we show that, unlike sets, the elements of a graph are not entirely unordered: there is a unique partial order over nodes and edges. With this partial order, PARD generates a graph block by block in an autoregressive fashion, where each block's conditional probability is modeled by a shared diffusion model with an equivariant network. To ensure efficiency while remaining expressive, we further propose a higher-order graph transformer that integrates the transformer with PPGN. Following GPT, we extend the higher-order graph transformer to support parallel training of all blocks. Without any extra features, PARD achieves state-of-the-art performance on molecular and non-molecular datasets, and scales to large datasets such as MOSES, which contains 1.9M molecules.
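To give some intuition for how a graph can carry an order that does not depend on node labels, the following minimal sketch partitions nodes into ordered blocks using only graph structure. It is an illustration under stated assumptions, not PARD's actual procedure: the abstract does not specify how the partial order is derived, and both the function name `degree_peeling_blocks` and the rule used here (repeatedly removing all nodes of minimum remaining degree) are hypothetical. The sketch only shows that such a block sequence is unchanged, up to relabeling, when the nodes are permuted.

```python
from collections import defaultdict


def degree_peeling_blocks(num_nodes, edges):
    """Split nodes into an ordered list of blocks using structure only.

    Hypothetical rule (an assumption for illustration, not taken from the
    source): at each step, remove every remaining node whose degree in the
    remaining subgraph is minimal. Nodes removed together form one block.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    remaining = set(range(num_nodes))
    blocks = []
    while remaining:
        # Degree of each node, counting only neighbours that are still present.
        deg = {u: len(adj[u] & remaining) for u in remaining}
        d_min = min(deg.values())
        block = {u for u in remaining if deg[u] == d_min}
        blocks.append(block)
        remaining -= block
    return blocks


# Path graph 0-1-2-3-4.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
blocks = degree_peeling_blocks(5, edges)

# Relabel the nodes with an arbitrary permutation (old label -> new label).
perm = [3, 0, 4, 1, 2]
relabeled_edges = [(perm[u], perm[v]) for u, v in edges]
relabeled_blocks = degree_peeling_blocks(5, relabeled_edges)

# The block sequence of the relabeled graph is the relabeled block sequence.
expected = [{perm[u] for u in block} for block in blocks]
```

Because the rule looks only at degrees, nodes within a block remain mutually unordered while the blocks themselves are ordered, which is the sense in which the order is partial. An autoregressive model could then treat each block as one generation step, with the direction of traversal left as a design choice.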