Physics-Informed Attention Mechanism and Generalization Capability of Deep Learning-Based Grain Growth Evolution Prediction

Machine learning (ML) models for grain growth prediction are typically trained on idealized synthetic data, yet practical applications require generalization to conditions outside the training distribution. This study evaluated the out-of-distribution (OOD) generalization capability of the trained model from our previous study across three test cases: experimental microstructures, microstructures with a bimodal grain size distribution, and abnormal grain growth. To probe whether physics-informed architectural design could improve robustness under these conditions, a boundary-masked attention mechanism was proposed specifically for grain growth, constraining attention to grain boundary pixels. Both the baseline and the proposed physics-informed attention model were evaluated without retraining or fine-tuning on the OOD data. Both models generalized to all three test cases, and the boundary-masked attention mechanism provided substantial improvements, with the most notable gains for microstructures with a bimodal grain size distribution, where the Structural Similarity Index Measure (SSIM) improved from \num{0.6221} to \num{0.7609} and the mean grain size ($\overline{R}$) error decreased from \SI{8.75}{\percent} to \SI{3.57}{\percent}. Attention heatmap analysis revealed that the boundary-masked attention model learned to concentrate attention on large grain boundaries in a manner consistent with curvature-driven grain growth physics, a behavior that emerged from training rather than being explicitly encoded in the architecture. These results indicate that models trained on synthetic data can generalize to diverse OOD conditions without retraining, and that physics-informed attention may improve accuracy when the boundary morphology matches the training domain.
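
The idea of constraining attention to grain boundary pixels can be illustrated with a minimal NumPy sketch. It is not the model evaluated in this study: the function names `boundary_mask` and `boundary_masked_attention`, the 4-neighbour boundary definition, the single attention head, and the identity query/key/value projections are all assumptions made for illustration. The sketch only shows one plausible reading of the mechanism, in which keys at non-boundary pixels are excluded before the softmax.

```python
import numpy as np


def boundary_mask(labels: np.ndarray) -> np.ndarray:
    """Flag pixels that have a 4-neighbour with a different grain ID.

    Assumption: `labels` is an (H, W) integer map of grain IDs with
    non-periodic edges. The paper may define boundaries differently.
    """
    mask = np.zeros(labels.shape, dtype=bool)
    diff_v = labels[:-1, :] != labels[1:, :]   # vertical neighbour pairs
    diff_h = labels[:, :-1] != labels[:, 1:]   # horizontal neighbour pairs
    # Mark both pixels of every mismatching pair.
    mask[:-1, :] |= diff_v
    mask[1:, :] |= diff_v
    mask[:, :-1] |= diff_h
    mask[:, 1:] |= diff_h
    return mask


def boundary_masked_attention(features: np.ndarray, mask: np.ndarray):
    """Single-head self-attention whose keys are limited to boundary pixels.

    features: (H, W, d) per-pixel feature vectors.
    mask:     (H, W) boolean boundary map.
    Returns the attended features (H, W, d) and the weights (H*W, H*W).
    """
    h, w, d = features.shape
    x = features.reshape(h * w, d)
    key_is_boundary = mask.reshape(-1)
    if not key_is_boundary.any():
        raise ValueError("mask contains no boundary pixels")

    # Assumption: Q = K = V = x (no learned projections) to keep this short.
    scores = x @ x.T / np.sqrt(d)
    # Non-boundary keys get -inf, so their softmax weight is exactly zero.
    scores = np.where(key_is_boundary[None, :], scores, -np.inf)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return (weights @ x).reshape(h, w, d), weights


# Hypothetical two-grain microstructure: left half is grain 0, right half grain 1.
labels = np.array([[0, 0, 1, 1]] * 4)
mask = boundary_mask(labels)
features = np.random.default_rng(0).normal(size=(4, 4, 3))
attended, weights = boundary_masked_attention(features, mask)
```

In this toy case every query pixel still receives an output, but that output is a weighted average of features taken only from the two columns adjacent to the grain boundary. Reshaping a row of `weights` back to `(H, W)` gives a per-query attention heatmap of the kind the abstract refers to.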
