Machine learning (ML) plays an important role in quantum chemistry, providing fast-to-evaluate predictive models for various molecular properties. However, because most existing ML models for molecular electronic properties are trained on density functional theory (DFT) databases as the ground truth, their prediction accuracy cannot exceed that of DFT. In this work, we develop a unified ML method for the electronic structure of organic molecules, using gold-standard CCSD(T) calculations as training data. Tested on hydrocarbon molecules, our model outperforms DFT with the widely used B3LYP functional in both computational cost and prediction accuracy for various quantum chemical properties. We apply the model to aromatic compounds and semiconducting polymers, for both ground-state and excited-state properties, demonstrating its accuracy and its ability to generalize to complex systems that are difficult to treat with CCSD(T)-level methods.