Predicting chemical reactions, a fundamental challenge in chemistry, involves forecasting the products that result from a given reaction process. Conventional techniques, notably those based on Graph Neural Networks (GNNs), are often limited by insufficient training data and by their inability to use textual information, which restricts their applicability in real-world settings. In this work, we propose ReLM, a novel framework that leverages the chemical knowledge encoded in language models (LMs) to assist GNNs, thereby improving the accuracy of real-world chemical reaction predictions. To further improve the model's robustness and interpretability, we incorporate a confidence score strategy that enables the LMs to self-assess the reliability of their predictions. Our experimental results demonstrate that ReLM improves the performance of state-of-the-art GNN-based methods across various chemical reaction datasets, especially in out-of-distribution settings. Code is available at https://github.com/syr-cn/ReLM.