Interest in applying machine learning methods to problems in engineering mechanics has grown in recent years, particularly in using deep learning techniques to predict the mechanical behavior of heterogeneous materials and structures. Researchers have shown that deep learning methods can predict mechanical behavior with low error for systems ranging from engineered composites, to geometrically complex metamaterials, to heterogeneous biological tissue. However, comparatively little attention has been paid to deep learning model calibration, i.e., the match between the predicted probabilities of outcomes and the true probabilities of outcomes. In this work, we perform a comprehensive investigation into machine learning model calibration across seven open access engineering mechanics datasets that cover three distinct types of mechanical problems. Specifically, we evaluate both model error and model calibration error for multiple machine learning methods, and investigate the influence of ensemble averaging and post hoc model calibration via temperature scaling. Overall, we find that ensemble averaging of deep neural networks is an effective and consistent tool for improving model calibration, while temperature scaling has comparatively limited benefits. Looking forward, we anticipate that this investigation will lay the foundation for future work on developing mechanics-specific approaches to deep learning model calibration.
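To make the notion of calibration above concrete, the following is a minimal sketch for a classification setting. The abstract does not state which calibration metric or fitting procedure was used, so everything here is an assumption chosen for illustration: the binned expected calibration error (ECE) as the metric, a simple grid search over the temperature that minimizes validation negative log-likelihood, and ensemble averaging taken over softmax probabilities. All function names are hypothetical.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; temperature > 1 softens them."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE: weighted mean gap between accuracy and confidence."""
    confidence = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidence > lo) & (confidence <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidence[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by fraction of samples in bin
    return ece


def negative_log_likelihood(probs, labels):
    """Mean negative log-probability assigned to the true class."""
    picked = probs[np.arange(len(labels)), labels]
    return -np.mean(np.log(picked + 1e-12))


def fit_temperature(val_logits, val_labels, grid=None):
    """Assumed procedure: pick the temperature on a grid that minimizes
    validation NLL (a gradient-based fit would be a common alternative)."""
    if grid is None:
        grid = np.linspace(0.25, 5.0, 96)
    losses = [
        negative_log_likelihood(softmax(val_logits, t), val_labels)
        for t in grid
    ]
    return float(grid[int(np.argmin(losses))])


def ensemble_average(logits_per_model):
    """Average the softmax probabilities of independently trained models."""
    return np.mean([softmax(l) for l in logits_per_model], axis=0)
```

In use, one would fit the temperature on held-out validation logits, apply it to test logits via `softmax(test_logits, T)`, and compare `expected_calibration_error` before and after scaling, and likewise for a single model versus `ensemble_average` over several models. This sketch covers classification only and is not meant to reproduce the study's procedure.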