Deep generative models are accelerating inverse design in materials and drug design. Unlike the property predictors that accompany them in typical molecular design frameworks, generative molecular design models have received less attention in uncertainty quantification (UQ), because their large number of parameters makes Bayesian inference computationally challenging. In this work, we focus on the junction-tree variational autoencoder (JT-VAE), a popular model for generative molecular design, and address this issue by using a low-dimensional active subspace (AS) to capture the uncertainty in the model parameters. Specifically, we approximate the posterior distribution over the active subspace parameters to estimate epistemic model uncertainty in an extremely high-dimensional parameter space. The proposed UQ scheme requires no change to the model architecture, so it is readily applicable to any pre-trained model. Our experiments demonstrate the efficacy of AS-based UQ and its potential impact on molecular optimization by exploring model diversity under epistemic uncertainty.
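The active subspace idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the subspace is taken from the dominant eigenvectors of the uncentered gradient covariance, uses a toy quadratic objective in place of the JT-VAE loss, and replaces posterior inference over the subspace coordinates with a fixed Gaussian purely for illustration. The names `active_subspace`, `perturb`, `theta0`, and `z_samples` are hypothetical.

```python
import numpy as np

def active_subspace(grads, k):
    """Return a (D, k) orthonormal basis and the eigenvalue spectrum.

    grads: (M, D) array whose rows are gradient samples of a scalar
    objective with respect to the D model parameters.
    The right singular vectors of grads / sqrt(M) are the eigenvectors
    of the uncentered gradient covariance C = grads.T @ grads / M.
    """
    m = grads.shape[0]
    _, s, vt = np.linalg.svd(grads / np.sqrt(m), full_matrices=False)
    return vt[:k].T, s ** 2

def perturb(theta0, basis, z):
    """Map low-dimensional coordinates z back to the full parameter space."""
    return theta0 + basis @ z

rng = np.random.default_rng(0)
D, k, M = 50, 2, 200

# Toy objective (assumption): f(theta) = 0.5 * ||A.T @ theta||^2,
# which varies only along the k-dimensional column space of A.
A = rng.standard_normal((D, k))
thetas = rng.standard_normal((M, D))
grads = thetas @ A @ A.T  # row i is the gradient A @ A.T @ theta_i

basis, eigvals = active_subspace(grads, k)

# Stand-in for posterior samples over the subspace coordinates
# (assumption: a fixed isotropic Gaussian, not a fitted posterior).
theta0 = np.zeros(D)
z_samples = 0.1 * rng.standard_normal((5, k))
models = np.stack([perturb(theta0, basis, z) for z in z_samples])
```

Each row of `models` is one full parameter vector drawn by sampling only `k` coordinates, which is what makes inference tractable when `D` is very large. In practice the gradients would come from the pre-trained model's loss and the distribution over `z` would be fitted, for example by variational inference.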