The efficient exploration of chemical space to design molecules with intended properties enables the accelerated discovery of drugs, materials, and catalysts, and is one of the most important outstanding challenges in chemistry. Encouraged by the recent surge in computing power and advances in artificial intelligence, researchers have developed many algorithms to tackle this problem. However, despite the emergence of many new approaches in recent years, comparatively little progress has been made in developing realistic benchmarks that reflect the complexity of molecular design for real-world applications. In this work, we develop a set of practical benchmark tasks that rely on physical simulation of molecular systems and mimic real-life molecular design problems for materials, drugs, and chemical reactions. Additionally, we demonstrate the utility and ease of use of our new benchmark set by showing how to compare the performance of several well-established families of algorithms. Overall, we believe that our benchmark suite will help move the field towards more realistic molecular design benchmarks, and bring the development of inverse molecular design algorithms closer to the practice of designing molecules that solve existing problems in both academia and industry.