Generative molecular optimization aims to design molecules with properties surpassing those of existing compounds. However, such candidates are rare and expensive to evaluate, making sample efficiency essential. Additionally, surrogate models, introduced to predict molecule evaluations, suffer from distribution shift as optimization drives candidates increasingly out of distribution. To address these challenges, we introduce Joint Self-Improvement, which benefits from (i) a joint generative-predictive model and (ii) a self-improving sampling scheme. The former aligns the generator with the surrogate, alleviating distribution shift, while the latter uses the predictive part of the joint model to bias the generative part, so that optimized molecules are generated efficiently at inference time. Experiments across offline and online molecular optimization benchmarks demonstrate that Joint Self-Improvement outperforms state-of-the-art methods under limited evaluation budgets.
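To make the idea of predictor-guided sampling concrete, the following is a minimal toy sketch, not the method proposed here. It swaps in a simple cross-entropy-method-style loop: a generator (independent per-position token distributions) is sampled, a stand-in predictor scores the samples, and the generator is nudged toward the top-scoring ones. Every name in it (`ALPHABET`, `LENGTH`, `predict`, `sample`, `self_improve`) and every parameter value is a hypothetical chosen for illustration. In particular, the generator and predictor here are separate toy components rather than a joint model, and no real molecule evaluation takes place.

```python
import random

ALPHABET = "CNO"  # hypothetical toy token vocabulary, not a real molecular grammar
LENGTH = 8        # hypothetical fixed sequence length


def predict(seq):
    """Toy surrogate (assumption): score is the fraction of 'N' tokens."""
    return seq.count("N") / len(seq)


def sample(probs, rng):
    """Draw one sequence from independent per-position categorical distributions."""
    return "".join(rng.choices(ALPHABET, weights=p)[0] for p in probs)


def self_improve(rounds=10, n_samples=200, top_k=20, smoothing=0.5, seed=0):
    """Bias the toy generator toward sequences the toy predictor scores highly."""
    rng = random.Random(seed)
    # Start from a uniform generator.
    probs = [[1 / len(ALPHABET)] * len(ALPHABET) for _ in range(LENGTH)]
    history = []  # mean predicted score of each sampled batch
    for _ in range(rounds):
        batch = [sample(probs, rng) for _ in range(n_samples)]
        history.append(sum(predict(s) for s in batch) / n_samples)
        # Keep the samples the predictor ranks highest.
        elite = sorted(batch, key=predict, reverse=True)[:top_k]
        # Move each position's distribution toward the elite token frequencies.
        for pos in range(LENGTH):
            for j, tok in enumerate(ALPHABET):
                freq = sum(s[pos] == tok for s in elite) / top_k
                probs[pos][j] = (1 - smoothing) * probs[pos][j] + smoothing * freq
    return probs, history


if __name__ == "__main__":
    _, history = self_improve()
    print("mean predicted score, first vs. last round:", history[0], history[-1])
```

Only the predictor is queried inside the loop, so no expensive evaluations are spent while the generator is being biased. This is the sense in which such a scheme can be sample efficient, provided the predictor remains reliable on the samples it is asked to score.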