Predicting the Kohn-Sham Hamiltonian with machine learning can accelerate density functional theory while retaining access to molecular orbitals, energy levels, and electronic-structure observables that energy-only surrogates cannot resolve. Yet element-wise agreement with the converged Hamiltonian, an implicit fixed point of the self-consistent field iteration, does not determine the occupied subspace that governs orbital energies and densities. Here we present HamEvo, a neural operator that learns the single-step self-consistent update and returns the converged Hamiltonian as its fixed point. HamEvo is pre-trained on intermediate self-consistent trajectories and calibrated at equilibrium with density-matrix supervision. Across benchmarks from MD17 to drug-like QMugs, HamEvo lowers Hamiltonian errors by 35-49% over direct-regression and deep-equilibrium baselines, and predicts QMugs HOMO and LUMO energies with mean absolute errors of 0.036 and 0.053 eV, near the 1 kcal/mol chemical-accuracy scale. Few-shot fine-tuning with only 20 reference conformations extends HamEvo to molecules of up to 122 atoms, well beyond the size range covered by pre-training. With thermal molecular-dynamics sampling, HamEvo captures temperature-dependent HOMO-LUMO gap renormalization beyond the harmonic approximation. Inference is up to 242 times faster than conventional DFT.
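The claim that element-wise agreement with the converged Hamiltonian does not determine the occupied subspace can be illustrated with a deliberately contrived two-level toy model. The sketch below is not HamEvo and uses no data from this work: the matrices `H_ref` and `H_pred`, the coupling `eps`, and the helper `occupied_density` are hypothetical names chosen for illustration, and an orthonormal basis (no overlap matrix) is assumed. It compares two Hamiltonians that differ only by a tiny off-diagonal sign flip, then compares the density matrices built from their lowest eigenvector.

```python
import numpy as np


def occupied_density(H, n_occ):
    """Projector onto the n_occ lowest eigenvectors of a symmetric matrix H.

    Assumes an orthonormal basis, so no overlap matrix enters the
    eigenproblem (an assumption of this toy, not of the paper).
    """
    _, vecs = np.linalg.eigh(H)  # eigenvalues are returned in ascending order
    C = vecs[:, :n_occ]          # occupied orbital coefficients
    return C @ C.T               # density matrix (projector), sign-invariant


# Hypothetical near-degenerate two-level system with a tiny coupling.
eps = 1e-3
H_ref = np.array([[0.0, eps], [eps, 0.0]])      # "reference" Hamiltonian
H_pred = np.array([[0.0, -eps], [-eps, 0.0]])   # "prediction": coupling sign flipped

# Element-wise Hamiltonian error is small (on the order of eps).
h_mae = np.abs(H_pred - H_ref).mean()

# The occupied subspaces, however, are orthogonal to each other.
P_ref = occupied_density(H_ref, n_occ=1)
P_pred = occupied_density(H_pred, n_occ=1)
p_max_err = np.abs(P_pred - P_ref).max()

print(f"Hamiltonian MAE:          {h_mae:.1e}")
print(f"Density-matrix max error: {p_max_err:.1e}")
```

In this toy the Hamiltonian error scales with `eps` while the density-matrix error stays of order one, because the occupied eigenvector switches from the antisymmetric to the symmetric combination. Real molecular Hamiltonians are rarely this degenerate, so the example shows only the mechanism that motivates supervising the density matrix in addition to the Hamiltonian elements.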