Diffusion coefficients are key thermophysical properties for modeling mass transport in liquids, but experimental data are scarce, which makes reliable prediction methods indispensable. In this work, we introduce a new method for predicting diffusion coefficients of molecular components at infinite dilution in pure liquid solvents by integrating the Stokes-Einstein (SE) equation with machine learning (ML). Unlike previous ML approaches, the resulting hybrid Enhanced Stokes-Einstein (ESE) model provides strictly physically consistent predictions of diffusion coefficients as a function of temperature across a broad range of binary mixtures. Trained and validated on an extensive compilation of literature data for infinite-dilution diffusion coefficients in binary liquid systems, ESE achieves significantly higher prediction accuracy than the previous state-of-the-art model, SEGWE. As additional inputs, it requires only the SMILES strings of the components of interest, which are always available. This simplicity makes ESE broadly applicable, e.g., for process design and optimization. The ESE model and its source code are fully disclosed and are directly accessible via an interactive web interface at https://ml-prop.mv.rptu.de/.
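For orientation, the classical Stokes-Einstein relation that ESE builds on can be sketched as below. This is only the textbook form, D = k_B T / (6 π η r), for a spherical solute of hydrodynamic radius r in a solvent of dynamic viscosity η; how ESE combines this relation with ML is not described in this abstract and is not reproduced here. The function name and the example values (an approximate viscosity of water near 25 °C and an assumed solute radius of 0.2 nm) are illustrative assumptions.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def stokes_einstein_diffusion(temperature_k, viscosity_pa_s, radius_m):
    """Textbook Stokes-Einstein estimate of a diffusion coefficient in m^2/s.

    temperature_k  : absolute temperature in K
    viscosity_pa_s : dynamic viscosity of the pure solvent in Pa*s
    radius_m       : hydrodynamic radius of the solute in m (assumed input)
    """
    if temperature_k <= 0 or viscosity_pa_s <= 0 or radius_m <= 0:
        raise ValueError("temperature, viscosity and radius must be positive")
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * radius_m)


if __name__ == "__main__":
    # Illustrative values only: water-like solvent, hypothetical 0.2 nm solute.
    d = stokes_einstein_diffusion(298.15, 8.9e-4, 2.0e-10)
    print(f"D = {d:.3e} m^2/s")
```

With these inputs the estimate is on the order of 1e-9 m^2/s, which is the typical magnitude for small molecules in liquids. Note that in a real liquid the viscosity itself changes with temperature, so it has to be supplied at the temperature of interest.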