Generating desirable molecular structures in 3D is a fundamental problem in drug discovery. Despite considerable progress, existing methods usually generate molecules at atom resolution and ignore intrinsic local structures such as rings, which leads to poor quality in the generated structures, especially for large molecules. Fragment-based molecule generation is a promising strategy; however, adapting it to 3D non-autoregressive generation is nontrivial because of the combinatorial optimization problems involved. In this paper, we use a coarse-to-fine strategy to tackle this problem and propose a Hierarchical Diffusion-based model (i.e., HierDiff) that preserves the validity of local segments without relying on autoregressive modeling. Specifically, HierDiff first generates coarse-grained molecule geometries via an equivariant diffusion process, where each coarse-grained node represents a fragment in a molecule. The coarse-grained nodes are then decoded into fine-grained fragments by a message-passing process and a newly designed iterative refined sampling module. Lastly, the fine-grained fragments are assembled to derive a complete atomic molecular structure. Extensive experiments demonstrate that HierDiff consistently improves the quality of molecule generation over existing methods.