Adversarial approaches, which intentionally challenge machine learning models by generating difficult examples, are increasingly being adopted to improve machine learning interatomic potentials (MLIPs). Although these approaches already provide great practical value, little is known about the actual prediction errors of MLIPs on adversarial structures or whether these errors can be controlled. We propose the Calibrated Adversarial Geometry Optimization (CAGO) algorithm to discover adversarial structures with user-assigned errors. Through uncertainty calibration, the estimated uncertainty of MLIPs is aligned with the real errors. By performing geometry optimization on the calibrated uncertainty, we reach adversarial structures with the user-assigned target MLIP prediction error. We benchmark CAGO within active learning pipelines and obtain stable MLIPs that systematically converge structural, dynamical, and thermodynamic properties for liquid water and for water adsorption in a metal-organic framework with only hundreds of training structures, where previously many thousands were typically required.
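The two ingredients named above, uncertainty calibration followed by geometry optimization toward a target error, can be illustrated with a toy sketch. The abstract does not specify the form of the calibration or of the optimization objective, so everything below is an assumption made for illustration: the harmonic "ensemble", the single least-squares calibration scale, the squared-difference loss, and all function names are hypothetical and are not taken from CAGO itself.

```python
import numpy as np

def ensemble_energies(x, stiffness):
    """Toy ensemble (assumption): each member is a harmonic well with its own stiffness."""
    return 0.5 * stiffness * np.dot(x, x)

def raw_uncertainty(x, stiffness):
    """Uncalibrated uncertainty: standard deviation of the ensemble predictions."""
    return np.std(ensemble_energies(x, stiffness))

def fit_calibration_scale(sigmas, abs_errors):
    """Least-squares scale a such that a * sigma approximates |error| on held-out data."""
    sigmas = np.asarray(sigmas, dtype=float)
    abs_errors = np.asarray(abs_errors, dtype=float)
    return float(np.dot(sigmas, abs_errors) / np.dot(sigmas, sigmas))

def optimize_to_target_error(x0, stiffness, scale, target,
                             lr=0.05, steps=2000, h=1e-6):
    """Gradient descent on (calibrated uncertainty - target)^2.

    Uses central finite differences for the gradient; x0 must not be the
    origin, where the gradient of this toy loss vanishes.
    """
    x = np.array(x0, dtype=float)

    def loss(y):
        return (scale * raw_uncertainty(y, stiffness) - target) ** 2

    for _ in range(steps):
        grad = np.array([(loss(x + h * e) - loss(x - h * e)) / (2 * h)
                         for e in np.eye(x.size)])
        x -= lr * grad
    return x

# Hypothetical held-out (uncertainty, |error|) pairs used only for calibration.
stiffness = np.array([0.8, 1.0, 1.2])
scale = fit_calibration_scale([0.1, 0.2, 0.4], [0.2, 0.4, 0.8])

# Move the geometry until the calibrated uncertainty matches the requested error.
x_adv = optimize_to_target_error([0.5, 0.2, -0.1], stiffness, scale, target=0.5)
calibrated = scale * raw_uncertainty(x_adv, stiffness)
```

In a realistic setting the toy ensemble would be replaced by an actual MLIP with an uncertainty estimate, and the finite-difference gradient by automatic differentiation; the sketch only shows how a calibrated uncertainty can serve as the quantity being steered toward a user-chosen value.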