Accurately predicting vapor pressure is vital for many industrial and environmental applications. However, obtaining accurate measurements for all compounds of interest is impractical because experiments are resource- and labor-intensive. These costs multiply when a temperature-dependent relationship for predicting vapor pressure is desired. In this paper, we propose PUFFIN (Path-Unifying Feed-Forward Interfaced Network), a machine learning framework that combines transfer learning with a new inductive bias node inspired by domain knowledge (the Antoine equation) to improve vapor pressure prediction. By leveraging inductive bias and transfer learning with graph embeddings, PUFFIN outperforms alternative strategies that do not use inductive bias or that rely on generic descriptors of compounds. By incorporating domain-specific knowledge to overcome limited data availability, the framework shows potential for broader applications in chemical compound analysis, including the prediction of other physicochemical properties. Importantly, the proposed framework is partially interpretable, because the inductive Antoine node yields network-derived Antoine equation coefficients. The resulting analytical expression could then be incorporated directly into process design software for better prediction and control of industrial and environmental processes.
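To make the role of the Antoine equation concrete, the following minimal Python sketch evaluates its standard form, log10(P) = A - B / (C + T), for a given set of coefficients. It is an illustration only and not the PUFFIN implementation: the function names are hypothetical, and the coefficients are commonly tabulated values for water (T in degrees Celsius, P in mmHg, roughly 1 to 100 degrees Celsius) that stand in for the coefficients an Antoine node would output.

```python
def antoine_log10_pressure(a, b, c, temperature):
    """Standard Antoine form: log10(P) = A - B / (C + T)."""
    return a - b / (c + temperature)


def vapor_pressure(coeffs, temperature):
    """Convert Antoine coefficients (A, B, C) and a temperature into a pressure."""
    a, b, c = coeffs
    return 10.0 ** antoine_log10_pressure(a, b, c, temperature)


# Assumption: commonly tabulated Antoine coefficients for water
# (T in degrees Celsius, P in mmHg). In a framework like the one described,
# these three numbers would instead come from the network for each compound.
WATER_COEFFS = (8.07131, 1730.63, 233.426)

if __name__ == "__main__":
    for t in (25.0, 50.0, 100.0):
        # At 100 degrees Celsius the result should be close to 760 mmHg (1 atm).
        print(f"T = {t:6.1f} C  ->  P = {vapor_pressure(WATER_COEFFS, t):8.2f} mmHg")
```

Because the three coefficients fully determine the temperature dependence, a model that predicts them for a compound gives a closed-form expression that can be evaluated at any temperature in its range of validity, which is what makes such an output reusable outside the model.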