Machine learning (ML) approaches have shown promising results for predicting molecular properties relevant to chemical process design. However, they are often limited by scarce experimental property data and lack thermodynamic consistency. Thermodynamics-informed ML, i.e., incorporating thermodynamic relations into the loss function as a regularization term during training, has therefore been proposed. We herein transfer the concept of thermodynamics-informed graph neural networks (GNNs) from the Gibbs-Duhem equation to the Clapeyron equation, predicting four pure-component properties in a multi-task manner: vapor pressure, liquid molar volume, vapor molar volume, and enthalpy of vaporization. We find that the Clapeyron-GNN predicts more accurately than the single-task learning setting and approximates the Clapeyron equation more closely than the purely data-driven multi-task learning setting. Moreover, we observe the largest improvement in prediction accuracy for the properties with the lowest data availability, which makes our model promising for data-scarce scenarios in chemical engineering practice.
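
For orientation, the Clapeyron equation links the four predicted properties, so its residual can serve as the regularization term. The block below is a minimal sketch of one way such a loss could be written, not the formulation used in this work. The symbols are assumptions introduced here for illustration: $p^{\mathrm{sat}}$ (vapor pressure), $v^{\mathrm{V}}$ and $v^{\mathrm{L}}$ (vapor and liquid molar volumes), $\Delta h^{\mathrm{vap}}$ (enthalpy of vaporization), hats for model predictions, $w_k$ and $\lambda$ for task and regularization weights, and $N$ evaluation points; the squared-residual form is likewise an assumption.

```latex
% Clapeyron equation for a pure component along the vapor-liquid
% saturation curve (standard thermodynamic relation):
\begin{equation}
  \frac{\mathrm{d}p^{\mathrm{sat}}}{\mathrm{d}T}
  = \frac{\Delta h^{\mathrm{vap}}}{T\,\bigl(v^{\mathrm{V}} - v^{\mathrm{L}}\bigr)}
\end{equation}

% Hypothetical training loss: weighted data losses for the four tasks
% plus a penalty on the Clapeyron residual of the model predictions.
\begin{align}
  \mathcal{L} &= \sum_{k \in \{p^{\mathrm{sat}},\, v^{\mathrm{L}},\, v^{\mathrm{V}},\, \Delta h^{\mathrm{vap}}\}}
      w_k\, \mathcal{L}_k^{\mathrm{data}}
      + \lambda\, \mathcal{L}^{\mathrm{Clapeyron}}, \\
  \mathcal{L}^{\mathrm{Clapeyron}} &= \frac{1}{N} \sum_{i=1}^{N}
      \left(
        \frac{\partial \hat{p}^{\mathrm{sat}}}{\partial T}\bigg|_{T_i}
        - \frac{\Delta \hat{h}^{\mathrm{vap}}(T_i)}
               {T_i\,\bigl(\hat{v}^{\mathrm{V}}(T_i) - \hat{v}^{\mathrm{L}}(T_i)\bigr)}
      \right)^{2}
\end{align}
```

In a sketch like this, the temperature derivative of the predicted vapor pressure would typically be obtained by automatic differentiation of the network output with respect to its temperature input, and the residual could be evaluated at temperatures without measured data for all four properties, which is one plausible reason such a regularizer helps the data-poor tasks.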