Density functional theory (DFT) is a powerful computational method for obtaining the physical and chemical properties of materials. Materials discovery often requires virtually screening a large, high-dimensional chemical space to find materials with desired properties. However, an exhaustive grid search of such a space with DFT is inefficient because each calculation is computationally expensive. We propose a smart search based on Bayesian optimization (BO) with an artificial neural network kernel. In this approach, the neural network is trained on a limited number of DFT results and determines the most promising regions of the chemical space to explore in subsequent iterations. The goal is to discover materials with target properties while minimizing the number of DFT calculations required. To demonstrate the effectiveness of the method, we investigated 63 doped graphene quantum dots (GQDs) with sizes ranging from 1 to 2 nm to find the structure with the highest light absorbance. With the BO algorithm and a neural network kernel, only 12 time-dependent DFT (TDDFT) calculations were needed, which reduces the computational cost to approximately 20% of that of a full grid search. Because a TDDFT calculation for a single GQD requires about half a day of wall time on high-performance computing nodes, this reduction is substantial. Our approach can be generalized to the discovery of new drugs, chemicals, crystals, and alloys in large, high-dimensional chemical spaces, offering a scalable solution for a range of applications in materials science.
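The search loop described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the candidate descriptors, the synthetic `true_absorbance` values standing in for TDDFT results, the fixed random tanh hidden layer used to define the kernel, the upper-confidence-bound acquisition rule, and all function names and hyperparameters (`n_init`, `kappa`, `hidden`) are assumptions made for illustration only.

```python
import numpy as np


def nn_features(X, W, b):
    """Activations of a fixed random one-hidden-layer tanh network (assumed kernel)."""
    return np.tanh(X @ W + b) / np.sqrt(W.shape[1])


def gp_posterior(phi_train, y_train, phi_all, noise=1e-3):
    """Gaussian-process posterior under the kernel k(x, x') = phi(x) . phi(x')."""
    K = phi_train @ phi_train.T + noise * np.eye(len(y_train))
    K_s = phi_all @ phi_train.T
    y_mean = y_train.mean()
    mean = y_mean + K_s @ np.linalg.solve(K, y_train - y_mean)
    v = np.linalg.solve(K, K_s.T)
    var = np.sum(phi_all**2, axis=1) - np.sum(K_s * v.T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))


def bayes_opt(X, evaluate, budget=12, n_init=4, kappa=2.0, hidden=64, seed=0):
    """Pick `budget` candidates to evaluate; return the best one found."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], hidden))
    b = rng.normal(size=hidden)
    phi = nn_features(X, W, b)

    # Start from a few randomly chosen candidates.
    evaluated = [int(i) for i in rng.choice(len(X), size=n_init, replace=False)]
    y = [evaluate(i) for i in evaluated]

    while len(evaluated) < budget:
        mean, std = gp_posterior(phi[evaluated], np.array(y), phi)
        ucb = mean + kappa * std      # upper confidence bound acquisition
        ucb[evaluated] = -np.inf      # never re-evaluate a candidate
        nxt = int(np.argmax(ucb))
        evaluated.append(nxt)
        y.append(evaluate(nxt))       # the expensive step (TDDFT in the text)

    best = evaluated[int(np.argmax(y))]
    return best, evaluated, y


# Synthetic stand-in for the 63 candidates: random 3-component descriptors and
# a smooth made-up "absorbance". Neither comes from the original study.
data_rng = np.random.default_rng(1)
X = data_rng.uniform(-1.0, 1.0, size=(63, 3))
true_absorbance = np.exp(-np.sum((X - 0.3) ** 2, axis=1))

best, evaluated, y = bayes_opt(X, lambda i: float(true_absorbance[i]))
fraction = len(evaluated) / len(X)
print(f"evaluated {len(evaluated)} of {len(X)} candidates ({fraction:.0%})")
```

In this sketch the budget of 12 evaluations out of 63 candidates corresponds to 12/63, or about 19% of a full grid search. At roughly half a day per calculation, that is about 6 days of sequential wall time instead of about 31.5. Whether the loop reaches the true optimum depends on the surrogate, the acquisition rule, and the descriptors, so the sketch makes no claim on that point.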