Molecule Design by Latent Prompt Transformer

This paper proposes a latent prompt Transformer model for solving challenging optimization problems such as molecule design, where the goal is to find molecules with optimal values of a target chemical or biological property that can be computed by existing software. Our proposed model consists of three components. (1) A latent vector whose prior distribution is modeled by a Unet transformation of a Gaussian white noise vector. (2) A molecule generation model that generates the string-based representation of a molecule conditional on the latent vector in (1). We adopt a causal Transformer model that takes the latent vector in (1) as its prompt. (3) A property prediction model that predicts the value of the target property of a molecule by non-linear regression on the latent vector in (1). We call the proposed model the latent prompt Transformer model. After initially training the model on existing molecules and their property values, we gradually shift the model distribution towards the region that supports desired values of the target property for the purpose of molecule design. Our experiments show that the proposed model achieves state-of-the-art performance on several benchmark molecule design tasks.
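The three components can be read as one joint model over a noise vector, a molecule string, and a property value. The block below is only a sketch of that reading: every symbol ($z_0$, $z$, $x$, $y$, $U_\alpha$, $s_\gamma$, and the parameter names $\alpha$, $\beta$, $\gamma$) is notation assumed here for illustration, not taken from the abstract, and the form of $p_\gamma(y \mid z)$ (for example, a Gaussian centered at $s_\gamma(z)$) is likewise an assumption.

```latex
% Requires amsmath. Notation is assumed for illustration only.
\begin{align}
  % (1) Latent vector: a Unet transformation U_alpha of Gaussian white noise z_0
  z &= U_\alpha(z_0), \qquad z_0 \sim \mathcal{N}(0, I_d), \\
  % (2) Causal Transformer over the tokens x^{(1)}, ..., x^{(T)} of the
  %     molecule string, with z supplied as the prompt
  p_\beta(x \mid z) &= \prod_{t=1}^{T}
      p_\beta\bigl(x^{(t)} \mid x^{(1)}, \dots, x^{(t-1)}, z\bigr), \\
  % (3) Property prediction: non-linear regression s_gamma on the latent vector
  \hat{y} &= s_\gamma(z), \\
  % Joint model: x and y are tied together only through the shared latent z
  p(z_0, x, y) &= p(z_0)\, p_\beta(x \mid z)\, p_\gamma(y \mid z).
\end{align}
```

Under this reading, the molecule and its property value are coupled only through the latent vector, so shifting the distribution of the latent vector towards desired property values also shifts the molecules that are generated.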

