Molecular conformation generation (MCG) is a fundamental problem in drug discovery. Many traditional methods have been developed to solve it, including systematic search, model building, random search, distance geometry, molecular dynamics, and Monte Carlo methods. However, each has limitations that depend on the molecular structure. Recently, many deep learning based MCG methods have been proposed, and they claim to largely outperform the traditional methods. To our surprise, we find that a simple, cheap, parameter-free algorithm built on the traditional methods is comparable to, or even outperforms, deep learning based MCG methods on the widely used GEOM-QM9 and GEOM-Drugs benchmarks. Specifically, our algorithm simply clusters RDKit-generated conformations. We hope our findings help the community revisit deep learning methods for MCG. The code for the proposed algorithm is available at https://gist.github.com/ZhouGengmo/5b565f51adafcd911c0bc115b2ef027c.
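To make the idea of clustering generated conformations concrete, here is a minimal sketch of one possible clustering step. It is not the authors' implementation (that is in the linked gist), and the abstract does not say which clustering method is used. Greedy k-center clustering is swapped in purely for illustration. The names `rmsd` and `k_center_cluster` are hypothetical, the number of clusters `k` is an assumed input, and the conformers are assumed to be pre-aligned coordinate lists, for instance extracted from conformers produced by RDKit's `AllChem.EmbedMultipleConfs`.

```python
import math


def rmsd(a, b):
    """RMSD between two conformers given as equal-length lists of (x, y, z).

    Assumption: the conformers are already aligned; no superposition is done.
    """
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))


def k_center_cluster(conformers, k):
    """Greedy k-center clustering (illustrative stand-in, not the paper's method).

    Returns (centers, labels): indices of the representative conformers and,
    for each conformer, the position of its nearest representative in `centers`.
    """
    if not conformers or k <= 0:
        return [], []
    k = min(k, len(conformers))
    centers = [0]  # arbitrary choice: start from the first conformer
    # Distance from every conformer to its nearest chosen center so far.
    nearest = [rmsd(c, conformers[0]) for c in conformers]
    while len(centers) < k:
        # The next center is the conformer farthest from all current centers.
        nxt = max(range(len(conformers)), key=lambda i: nearest[i])
        centers.append(nxt)
        for i, c in enumerate(conformers):
            nearest[i] = min(nearest[i], rmsd(c, conformers[nxt]))
    labels = [
        min(range(len(centers)), key=lambda j: rmsd(c, conformers[centers[j]]))
        for c in conformers
    ]
    return centers, labels


# Toy data: four two-atom "conformers" forming two obvious groups.
toy = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.01, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 1.01, 0.0)],
]
centers, labels = k_center_cluster(toy, k=2)
print(centers, labels)
```

In this sketch the representatives in `centers` would serve as the final, diverse set of conformations. A real pipeline would also need to align conformers before comparing them, which is omitted here.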