Controllable Graph Generation with Diffusion Models via Inference-Time Tree Search Guidance

Graph generation is a fundamental problem in graph learning with broad applications across Web-scale systems, knowledge graphs, and scientific domains such as drug and material discovery. Recent approaches leverage diffusion models for step-by-step generation, yet unconditional diffusion offers little control over desired properties, often leading to unstable quality and difficulty in incorporating new objectives. Inference-time guidance methods mitigate these issues by adjusting the sampling process without retraining, but they remain inherently local, heuristic, and limited in controllability. To overcome these limitations, we propose TreeDiff, a Monte Carlo Tree Search (MCTS) guided dual-space diffusion framework for controllable graph generation. TreeDiff is a plug-and-play inference-time method that expands the search space while keeping computation tractable. Specifically, TreeDiff introduces three key designs to make it practical and scalable: (1) a macro-step expansion strategy that groups multiple denoising updates into a single transition, reducing tree depth and enabling long-horizon exploration; (2) a dual-space denoising mechanism that couples efficient latent-space denoising with lightweight discrete correction in graph space, ensuring both scalability and structural fidelity; and (3) a dual-space verifier that predicts long-term rewards from partially denoised graphs, enabling early value estimation and removing the need for full rollouts. Extensive experiments on 2D and 3D molecular generation benchmarks, under both unconditional and conditional settings, demonstrate that TreeDiff achieves state-of-the-art performance. Notably, TreeDiff exhibits favorable inference-time scaling: it continues to improve with additional computation, while existing inference-time methods plateau early under limited resources.
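
As an illustration only, the following toy sketch shows how macro-step expansion and early value estimation could fit into a tree search over a denoising chain. It is not the TreeDiff implementation. Every name here (`Node`, `denoise_once`, `macro_step`, `verifier`, `search`) is hypothetical, the "latent" is a short list of floats rather than a graph embedding, the verifier is a hand-written score rather than a learned model, and the final rounding merely stands in for discrete correction in graph space.

```python
import math
import random


class Node:
    """One search-tree node: a partially denoised toy state."""

    def __init__(self, state, steps_left, parent=None):
        self.state = state            # toy latent: list of floats
        self.steps_left = steps_left  # denoising updates still to apply
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value_sum = 0.0


def denoise_once(state, rng):
    # Assumed toy denoiser: shrink toward zero and add a little noise.
    return [0.8 * x + rng.gauss(0.0, 0.3) for x in state]


def macro_step(state, k, rng):
    # Group k denoising updates into a single tree transition.
    for _ in range(k):
        state = denoise_once(state, rng)
    return state


def verifier(state, target):
    # Toy stand-in for a learned verifier: score a partial state directly,
    # so no full rollout to the end of the chain is needed.
    return -abs(sum(state) - target)


def ucb(child, parent_visits, c=1.0):
    # Standard UCB1 score; unvisited children are tried first.
    if child.visits == 0:
        return math.inf
    mean = child.value_sum / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def search(init_state, total_steps, k, target, n_sim=200, branch=3, seed=0):
    rng = random.Random(seed)
    root = Node(list(init_state), total_steps)
    for _ in range(n_sim):
        node = root
        # Selection: descend while the node is fully expanded and non-terminal.
        while node.steps_left > 0 and len(node.children) >= branch:
            parent = node
            node = max(parent.children, key=lambda ch: ucb(ch, parent.visits))
        # Expansion: one macro-step creates one new child.
        if node.steps_left > 0:
            m = min(k, node.steps_left)
            child = Node(macro_step(node.state, m, rng), node.steps_left - m, node)
            node.children.append(child)
            node = child
        # Evaluation: estimate value from the partially denoised state.
        value = verifier(node.state, target)
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value_sum += value
            node = node.parent
    # Extraction: follow the most-visited children, finish any remaining steps.
    node = root
    while node.children:
        node = max(node.children, key=lambda ch: ch.visits)
    final = macro_step(node.state, node.steps_left, rng)
    # Stand-in for discrete correction: snap the latent to integers.
    return [round(x) for x in final], root


if __name__ == "__main__":
    sample, tree = search([3.0, -2.0, 1.5], total_steps=10, k=5, target=0.0)
    print(sample, tree.visits)
```

With `total_steps=10` and `k=5`, the tree is at most two levels deep instead of ten, which is the depth reduction that macro-step expansion is meant to provide. Increasing `n_sim` spends more inference-time computation on the same frozen denoiser.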
