Generative models for 3D molecular conformations must respect Euclidean symmetries and concentrate probability mass on thermodynamically favorable, mechanically stable structures. However, E(3)-equivariant diffusion models often reproduce biases from semi-empirical training data rather than capturing the equilibrium distribution of a high-fidelity Hamiltonian. While physics-based guidance can correct this, it faces two computational bottlenecks: expensive quantum-chemical evaluations (e.g., DFT) and the need to repeat such queries at every sampling step. We present Elign, a post-training framework that amortizes both costs. First, we replace expensive DFT evaluations with a faster, pretrained foundation machine-learning force field (MLFF) to provide physical signals. Second, we eliminate repeated run-time queries by shifting physical steering to the training phase. To achieve the second amortization, we formulate reverse diffusion as a reinforcement learning problem and introduce Force–Energy Disentangled Group Relative Policy Optimization (FED-GRPO) to fine-tune the denoising policy. FED-GRPO combines a potential-based energy reward and a force-based stability reward, each optimized and group-normalized independently. Experiments show that Elign generates conformations with lower gold-standard DFT energies and forces, while improving stability. Crucially, inference remains as fast as unguided sampling, since no energy evaluations are required during generation.
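The disentangled, group-relative reward normalization described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the reward definitions (negated MLFF energy and negated force norm) and the simple additive combination of the two normalized channels are assumptions made for clarity.

```python
import numpy as np

def fed_grpo_advantages(energies, force_norms, eps=1e-8):
    """Illustrative sketch of disentangled group-relative advantages.

    Each reward channel is standardized within the sampled group
    separately, so neither the energy nor the force signal can drown
    out the other, then the two are summed into one advantage per
    sample. Reward signs and the additive combination are assumptions.
    """
    energies = np.asarray(energies, dtype=float)
    force_norms = np.asarray(force_norms, dtype=float)

    # Lower potential energy and smaller residual forces are better,
    # so rewards are the negated quantities (hypothetical choice).
    r_energy = -energies
    r_force = -force_norms

    def group_normalize(r):
        # Standardize within the group of samples for one molecule.
        return (r - r.mean()) / (r.std() + eps)

    return group_normalize(r_energy) + group_normalize(r_force)

# Toy group of 4 conformations sampled for the same molecule:
adv = fed_grpo_advantages(
    energies=[-10.2, -9.8, -10.5, -9.1],      # e.g. MLFF energies (eV)
    force_norms=[0.4, 0.9, 0.2, 1.3],         # e.g. mean |force| (eV/Å)
)
```

Because each channel is normalized independently, a conformation that is good on both axes (here the third sample, with the lowest energy and the smallest force norm) receives the largest advantage, while the group-relative baseline makes the advantages sum to roughly zero.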