Optimizing techniques for discovering molecular structures with desired properties is crucial in artificial intelligence (AI)-based drug discovery. Combining deep generative models with reinforcement learning has emerged as an effective strategy for generating molecules with specific properties. Despite its potential, this approach explores the vast chemical space inefficiently and struggles to optimize particular chemical properties. To overcome these limitations, we present Mol-AIR, a reinforcement learning-based framework that uses adaptive intrinsic rewards for effective goal-directed molecular generation. Mol-AIR leverages the strengths of both history-based and learning-based intrinsic rewards by combining a counting-based strategy with random network distillation. In benchmark tests, Mol-AIR outperforms existing approaches in generating molecules with desired properties, including penalized LogP, QED, and celecoxib similarity, without any prior knowledge. We believe that Mol-AIR represents a significant advancement in drug discovery, offering a more efficient path to discovering novel therapeutics.
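To make the idea of combining history-based and learning-based intrinsic rewards concrete, the following is a minimal sketch and not the authors' implementation. It assumes molecules are represented as strings, that the history-based bonus is 1/sqrt(visit count), and that the learning-based bonus is the prediction error of a trainable linear map against a frozen random one (a toy stand-in for random network distillation). The names `CountBonus`, `RNDBonus`, `featurize`, and `total_reward`, and the fixed weights `alpha` and `beta`, are hypothetical; the abstract does not specify how Mol-AIR adapts or weights its rewards.

```python
from collections import Counter

import numpy as np


def featurize(smiles, dim=32):
    """Hypothetical featurizer: unit-norm bag of characters (illustration only)."""
    v = np.zeros(dim)
    for ch in smiles:
        v[ord(ch) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n > 0 else v


class CountBonus:
    """History-based bonus: 1/sqrt(N(s)), where N(s) counts visits to state s."""

    def __init__(self):
        self.counts = Counter()

    def __call__(self, state):
        self.counts[state] += 1
        return 1.0 / np.sqrt(self.counts[state])


class RNDBonus:
    """Learning-based bonus: error of a trained predictor vs. a frozen random target."""

    def __init__(self, in_dim, out_dim=16, lr=0.5, seed=0):
        rng = np.random.default_rng(seed)
        self.target = rng.normal(size=(in_dim, out_dim))  # frozen, never updated
        self.pred = np.zeros((in_dim, out_dim))           # trained online
        self.lr = lr

    def __call__(self, x):
        err = x @ self.pred - x @ self.target
        bonus = float(np.mean(err ** 2))
        # One gradient step on the mean squared error, so that inputs
        # seen repeatedly yield a shrinking bonus (assumes unit-norm x).
        self.pred -= self.lr * (2.0 / err.size) * np.outer(x, err)
        return bonus


def total_reward(extrinsic, smiles, count_bonus, rnd_bonus, alpha=0.1, beta=0.1):
    """Extrinsic property score plus weighted intrinsic bonuses (weights are assumptions)."""
    return (extrinsic
            + alpha * count_bonus(smiles)
            + beta * rnd_bonus(featurize(smiles)))
```

In this sketch, a molecule generated for the first time receives the largest bonus from both terms, and both bonuses shrink as the same molecule recurs, which nudges the policy toward unexplored regions of chemical space while the extrinsic term (for example, a QED score) drives property optimization.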