Developing effective anticancer therapeutics remains challenging because of tumor heterogeneity and the absence of well-defined molecular targets across cancer subtypes. Generative models conditioned on cancer genotypes offer a promising avenue for personalized drug discovery, yet existing approaches do not explicitly optimize for sensitivity, synthesizability, and mechanistic binding plausibility at the same time. We present a latent-space optimization approach for a pretrained genotype-to-drug diffusion model: a learnable perturbation over the molecular latent space is optimized by gradient ascent to maximize a composite reward combining predicted drug sensitivity (AUC), drug-likeness (QED), and synthetic accessibility (SAS). Critically, biological realism is enforced by grounding both reward design and evaluation in experimentally derived cancer cell line data and validated pharmacologic signals, anchoring candidate generation in real-world clinical evidence. Mechanistic plausibility is further assessed by a multi-agent LLM pipeline grounded in the diffusion model's attention mechanism. Experiments across 15 cancer cell lines from three held-out evaluation sets demonstrate consistent, noticeable improvements over competing baselines in sensitivity, drug-likeness, synthesizability, and chemical validity.
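To make the optimization loop concrete, the following is a minimal sketch of gradient ascent on a learnable latent perturbation under a composite reward. It is not the method's actual implementation. The three scoring functions are hypothetical toy surrogates standing in for the sensitivity, QED, and SAS predictors. The reward weights, the treatment of SAS as a penalty, the L2 regularizer on the perturbation, and the use of finite-difference gradients are all assumptions made for illustration; a real pipeline would presumably backpropagate through the pretrained diffusion model and its predictors.

```python
import math

# All three scorers below are hypothetical toy surrogates, not real predictors.

def toy_sensitivity(z):
    # Stand-in for a predicted sensitivity score (assumed: higher is better).
    return -sum((zi - 1.0) ** 2 for zi in z)

def toy_qed(z):
    # Stand-in for drug-likeness, squashed into (0, 1) like a QED score.
    return 1.0 / (1.0 + math.exp(-sum(z) / len(z)))

def toy_sas(z):
    # Stand-in for synthetic accessibility (assumed: lower means easier to make).
    return sum(zi * zi for zi in z) / len(z)

def composite_reward(z, weights=(1.0, 0.5, 0.5)):
    # Assumed weighting: reward sensitivity and QED, penalize SAS.
    w_sens, w_qed, w_sas = weights
    return w_sens * toy_sensitivity(z) + w_qed * toy_qed(z) - w_sas * toy_sas(z)

def numerical_gradient(f, x, eps=1e-5):
    # Central finite differences; a real system would use autodiff instead.
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += eps
        down = list(x)
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

def optimize_perturbation(z0, steps=200, lr=0.05, l2=0.01):
    # delta is the learnable perturbation; the base latent z0 stays fixed.
    delta = [0.0] * len(z0)

    def objective(d):
        z = [a + b for a, b in zip(z0, d)]
        # Assumed L2 penalty keeps the perturbed latent close to z0.
        return composite_reward(z) - l2 * sum(di * di for di in d)

    history = [objective(delta)]
    for _ in range(steps):
        grad = numerical_gradient(objective, delta)
        # Gradient ascent: step in the direction that increases the reward.
        delta = [d + lr * g for d, g in zip(delta, grad)]
        history.append(objective(delta))
    return delta, history

z0 = [0.0, -0.5, 2.0, 0.3]  # hypothetical latent code from the generator
delta, history = optimize_perturbation(z0)
print("reward before:", history[0])
print("reward after: ", history[-1])
```

Only the perturbation is updated in this sketch, which mirrors the idea of steering a pretrained generator without retraining it; the regularizer is one plausible way to keep the optimized latent in a region the decoder can still map to valid molecules.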