Active learning (AL) plays a critical role in materials science, enabling applications such as the construction of machine-learning interatomic potentials for atomistic simulations and the operation of self-driving laboratories. Despite its widespread use, the reliability and effectiveness of AL workflows depend on implicit design assumptions that are rarely examined systematically. Here, we critically assess AL workflows deployed in materials science and investigate how key design choices, such as surrogate models, sampling strategies, uncertainty quantification, and evaluation metrics, relate to their performance. By identifying common pitfalls and discussing practical mitigation strategies, we provide guidance to practitioners for the efficient design, assessment, and interpretation of AL workflows in materials science.