Estimating reaction rates and chemical stability is a fundamental problem, yet efficient methods for large-scale simulation remain out of reach despite advances in modeling and exascale computing. Direct simulation is limited to short timescales, and machine-learned potentials require large data sets and struggle in the transition-state regions that determine reaction rates. Exploring reaction networks with sufficient accuracy is limited by the computational cost of electronic structure calculations, and even simplifications such as harmonic transition state theory rely on prohibitively expensive saddle point searches. Surrogate-model-based acceleration has shown promise but has been held back by overhead and numerical instability. This dissertation presents a holistic solution that co-designs physical representations, statistical models, and systems architecture within the Optimal Transport Gaussian Process (OT-GP) framework. Using physics-aware optimal transport metrics, OT-GP builds compact, chemically relevant surrogates of the potential energy surface, underpinned by statistically robust sampling. Alongside rewrites of the EON software for long-timescale simulations, we introduce reinforcement learning approaches for both minimum-mode following (when the final state is unknown) and nudged elastic band methods (when the endpoints are specified). Collectively, these advances establish a representation-first, modular approach to chemical kinetics simulation. Large-scale benchmarks and Bayesian hierarchical validation demonstrate state-of-the-art performance and practical exploration of chemical kinetics, turning a longstanding theoretical promise into a working engine for discovery.