Macrocycles are ring-shaped molecules that offer a promising alternative to small-molecule drugs because of their enhanced selectivity and binding affinity against difficult targets. Despite their chemical value, they remain underexplored in generative modeling, likely owing to their scarcity in public datasets and the difficulty of enforcing topological constraints in standard deep generative models. We introduce MacroGuide: Topological Guidance for Macrocycle Generation, a diffusion guidance mechanism that uses persistent homology to steer the sampling of pretrained molecular generative models toward macrocycles, in both unconditional and conditional (protein pocket) settings. At each denoising step, MacroGuide constructs a Vietoris-Rips complex from the atomic positions and promotes ring formation by optimizing persistent homology features. Empirically, applying MacroGuide to pretrained diffusion models increases macrocycle generation rates from 1% to 99%, while matching or exceeding state-of-the-art performance on key quality metrics such as chemical validity, diversity, and PoseBusters checks.
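To make the topological signal concrete, the following is a minimal sketch of how a Vietoris-Rips complex built from 3D positions can reveal a ring. It is not the MacroGuide implementation: it only counts one-dimensional holes (the first Betti number, over GF(2)) at a few fixed scales, whereas the method described above optimizes persistent homology features during denoising. The function names `gf2_rank` and `betti_1`, the 12-point toy ring, and the chosen scales are all illustrative assumptions.

```python
import math
from itertools import combinations


def gf2_rank(columns):
    """Rank over GF(2) of a matrix whose columns are integer bitmasks."""
    pivots = {}  # leading bit position -> reduced column with that leading bit
    for col in columns:
        while col:
            high = col.bit_length() - 1
            if high not in pivots:
                pivots[high] = col
                break
            col ^= pivots[high]  # eliminate the leading bit and keep reducing
    return len(pivots)


def betti_1(points, eps):
    """Number of independent loops in the Vietoris-Rips complex at scale eps.

    Hypothetical helper: edges join points at distance <= eps, and a triangle
    is added whenever all three of its edges are present.
    """
    n = len(points)
    edges = [(i, j) for i, j in combinations(range(n), 2)
             if math.dist(points[i], points[j]) <= eps]
    edge_id = {e: k for k, e in enumerate(edges)}
    triangles = [t for t in combinations(range(n), 3)
                 if all(p in edge_id for p in combinations(t, 2))]
    # Boundary of an edge: its two endpoints (bits index vertices).
    d1 = [(1 << i) | (1 << j) for i, j in edges]
    # Boundary of a triangle: its three edges (bits index edges).
    d2 = [sum(1 << edge_id[p] for p in combinations(t, 2)) for t in triangles]
    # beta_1 = dim ker(d1) - rank(d2) = (#edges - rank d1) - rank d2
    return len(edges) - gf2_rank(d1) - gf2_rank(d2)


# Toy stand-in for a ring of atoms: 12 points on a unit circle in the z = 0 plane.
ring = [(math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12), 0.0)
        for k in range(12)]

for eps in (0.4, 0.6, 2.1):
    print(eps, betti_1(ring, eps))
```

On this toy ring, neighbouring points are about 0.52 apart, so no loop exists at scale 0.4, a single loop appears at 0.6, and it is filled in by triangles once the scale exceeds the ring diameter at 2.1. Persistent homology records exactly this birth and death of the loop across scales; a long-lived loop is the kind of feature a guidance term could reward, though the actual objective used by MacroGuide is not specified here.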