Collaborative Expert LLMs Guided Multi-Objective Molecular Optimization

Molecular optimization is a crucial yet complex and time-intensive process that often acts as a bottleneck in drug development. Traditional methods rely heavily on trial and error, which makes multi-objective optimization both slow and resource-intensive. Current AI-based methods have shown limited success on multi-objective optimization tasks, which hampers their practical use. To address this challenge, we present MultiMol, a collaborative large language model (LLM) system designed to guide multi-objective molecular optimization. MultiMol comprises two agents: a data-driven worker agent and a literature-guided research agent. The worker agent is an LLM fine-tuned to generate optimized molecules that satisfy multiple objectives, while the research agent searches task-related literature for prior knowledge that helps identify the most promising optimized candidates. In evaluations across six multi-objective optimization tasks, MultiMol significantly outperforms existing methods, achieving an 82.30% success rate, in sharp contrast to the 27.50% success rate of the strongest existing methods. To further validate its practical impact, we tested MultiMol on two real-world challenges. First, we enhanced the selectivity of Xanthine Amine Congener (XAC), a promiscuous ligand that binds both A1R and A2AR, successfully biasing it towards A1R. Second, we improved the bioavailability of Saquinavir, an HIV-1 protease inhibitor with known bioavailability limitations. Overall, these results indicate that MultiMol is a highly promising approach to multi-objective molecular optimization, with great potential to accelerate drug development and advance pharmaceutical research.

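The two-agent division of labor described in the abstract can be illustrated with a minimal sketch. It is not MultiMol's implementation: the names `Candidate`, `optimize`, `meets_all`, `success_rate`, `toy_worker`, and `toy_researcher` are hypothetical, the property values are placeholders rather than computed quantities, and the success criterion (every target threshold met) is an assumption that may differ from the paper's definition.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Candidate:
    """A proposed molecule with predicted property values (hypothetical type)."""
    smiles: str
    properties: Dict[str, float]


# Hypothetical interfaces for the two agents.
Worker = Callable[[str, Dict[str, float]], List[Candidate]]   # proposes candidates
Researcher = Callable[[Candidate], float]                     # scores a candidate


def optimize(source: str, targets: Dict[str, float],
             worker: Worker, researcher: Researcher,
             top_k: int = 1) -> List[Candidate]:
    """Generate candidates with the worker, then rank them with the researcher."""
    candidates = worker(source, targets)
    return sorted(candidates, key=researcher, reverse=True)[:top_k]


def meets_all(candidate: Candidate, targets: Dict[str, float]) -> bool:
    """Assumed success criterion: every target threshold is reached."""
    return all(candidate.properties.get(name, float("-inf")) >= threshold
               for name, threshold in targets.items())


def success_rate(picks: List[Candidate], targets: Dict[str, float]) -> float:
    """Fraction of selected candidates that satisfy all objectives."""
    if not picks:
        return 0.0
    return sum(meets_all(c, targets) for c in picks) / len(picks)


def toy_worker(source: str, targets: Dict[str, float]) -> List[Candidate]:
    """Stand-in for the fine-tuned LLM; returns fixed, made-up proposals."""
    return [
        Candidate(source + "O", {"solubility": 0.7, "qed": 0.4}),
        Candidate(source + "N", {"solubility": 0.6, "qed": 0.8}),
    ]


def toy_researcher(candidate: Candidate) -> float:
    """Stand-in for literature guidance; pretends nitrogen is favored."""
    return 1.0 if "N" in candidate.smiles else 0.0


if __name__ == "__main__":
    targets = {"solubility": 0.5, "qed": 0.5}
    best = optimize("CCO", targets, toy_worker, toy_researcher)
    print(best[0].smiles, success_rate(best, targets))
```

Keeping generation and selection behind separate callables mirrors the split between the data-driven worker and the literature-guided research agent, so either stand-in could be swapped for a real model or retrieval step without changing `optimize`.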