Drug response prediction (DRP) is a crucial phase in drug discovery, and the most important metric for its evaluation is the IC50 score. DRP results depend heavily on the quality of the generated molecules. Existing molecule generation methods typically employ classifier-based guidance, which enables sampling within an IC50 classification range. However, these methods cannot guarantee that the sampling space is effective, so they generate numerous ineffective molecules. Through experimental and theoretical study, we hypothesize that conditional generation based on the target IC50 score yields a more effective sampling space. We therefore introduce regressor-free guidance for molecule generation, which ensures sampling within a more effective space and supports DRP. Regressor-free guidance combines a diffusion model's score estimation with the gradient of a regression controller model based on numerical labels. To map regression labels between drugs and cell lines effectively, we design a common-sense numerical knowledge graph that constrains the order of text representations. Experimental results on real-world data for the DRP task demonstrate our method's effectiveness in drug discovery. The code is available at: https://anonymous.4open.science/r/RMCD-DBD1.
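To make the idea of combining a conditional and an unconditional estimate concrete, the following is a minimal sketch, not the method from the paper. It assumes the guidance takes the standard classifier-free guidance form, (1 + w) * eps_cond - w * eps_uncond, with a numeric IC50 target standing in for a class label; the formulation actually used may differ. The names `guided_noise`, `toy_denoiser`, and `target_ic50` are hypothetical and chosen only for illustration.

```python
import numpy as np


def guided_noise(eps_cond, eps_uncond, guidance_scale):
    """Mix two noise predictions in the classifier-free guidance form.

    Assumption: this is the generic formula (1 + w) * eps_cond - w * eps_uncond,
    used here as a stand-in for guidance conditioned on a numeric label.
    """
    eps_cond = np.asarray(eps_cond, dtype=float)
    eps_uncond = np.asarray(eps_uncond, dtype=float)
    return (1.0 + guidance_scale) * eps_cond - guidance_scale * eps_uncond


def toy_denoiser(x, target_ic50=None):
    """Hypothetical stand-in for a trained noise-prediction network.

    Without a target it returns x unchanged; with a target it shifts the
    prediction by the target value. A real model would be a neural network.
    """
    x = np.asarray(x, dtype=float)
    if target_ic50 is None:
        return x
    return x - target_ic50


if __name__ == "__main__":
    x = np.array([1.0, 2.0])
    eps_c = toy_denoiser(x, target_ic50=0.5)  # conditioned on a numeric target
    eps_u = toy_denoiser(x)                   # unconditional prediction
    print(guided_noise(eps_c, eps_u, guidance_scale=2.0))
```

With `guidance_scale` set to 0 the function returns the conditional prediction unchanged; larger values push the estimate further along the direction in which the conditional prediction differs from the unconditional one.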