A drug molecule is a substance that changes an organism's mental or physical state. Every approved drug has an indication, which refers to the therapeutic use of that drug for treating a particular medical condition. While Large Language Models (LLMs), a generative Artificial Intelligence (AI) technique, have recently demonstrated effectiveness in translating between molecules and their textual descriptions, there remains a gap in research on their application to translating between drug molecules and indications, in either direction, which could greatly benefit the drug discovery process. The capability of generating a drug from a given indication would allow for the discovery of drugs targeting specific diseases or targets and ultimately provide patients with better treatments. In this paper, we first propose a new task, the translation between drug molecules and their corresponding indications, and then test existing LLMs on it. Specifically, we consider nine variations of the T5 LLM and evaluate them on two public datasets obtained from ChEMBL and DrugBank. Our experiments show early results of using LLMs for this task and provide a perspective on the state of the art. We also emphasize the current limitations and discuss future work that has the potential to improve performance on this task. The creation of molecules from indications, or vice versa, would allow for more efficient targeting of diseases and significantly reduce the cost of drug discovery, with the potential to revolutionize the field in the era of generative AI.