Explainability techniques are crucial for understanding the reasons behind the predictions of deep learning models, but they have not yet been applied to chemical language models. We propose an explainable AI technique that quantifies the contribution of individual atoms to the predictions made by these models. Our method backpropagates relevance information to the chemical input string and visualizes the importance of individual atoms. We focus on self-attention Transformers operating on molecular string representations and fine-tune a pretrained encoder. We showcase the method by predicting and visualizing solubility in water and organic solvents. We achieve competitive model performance while obtaining interpretable predictions, which we use to inspect the pretrained model.