Mass spectra, which record the ionized fragments of a target molecule, are widely used across many fields to identify molecular structures. A prevalent analysis method is spectral library search, in which an unknown spectrum is compared against a database of reference spectra. The effectiveness of such search-based approaches, however, is limited by the coverage of existing mass spectra databases, which motivates expanding them through mass spectra prediction. In this work, we propose the Motif-based Mass Spectrum Prediction Network (MoMS-Net), which predicts mass spectra using information derived from structural motifs together with Graph Neural Networks (GNNs). We evaluated the model on diverse mass spectra and observed that it outperforms existing models. MoMS-Net considers substructures at the graph level, which facilitates the incorporation of long-range dependencies while using less memory than the graph transformer model.
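To make the spectral library search mentioned above concrete, the following is a minimal sketch of matching a query spectrum against a small reference library. It assumes cosine similarity over spectra binned at 1 Da, a common choice for this kind of search but not one stated in this abstract. The function names (`bin_spectrum`, `cosine_similarity`, `search_library`), the `max_mz` limit, and the toy peak lists are all hypothetical and do not represent real compounds or any part of MoMS-Net.

```python
import math

def bin_spectrum(peaks, max_mz=1000):
    """Convert (m/z, intensity) peaks into a fixed-length vector with 1 Da bins."""
    vec = [0.0] * (max_mz + 1)
    for mz, intensity in peaks:
        idx = int(round(mz))
        if 0 <= idx <= max_mz:
            vec[idx] += intensity
    return vec

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_a * norm_b)

def search_library(query_peaks, library, top_k=1):
    """Rank library entries by similarity to the query, best match first."""
    query_vec = bin_spectrum(query_peaks)
    scored = [
        (name, cosine_similarity(query_vec, bin_spectrum(peaks)))
        for name, peaks in library.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

# Toy library with invented peaks (illustrative only, not real spectra).
LIBRARY = {
    "compound_A": [(43.0, 100.0), (58.0, 30.0), (71.0, 10.0)],
    "compound_B": [(51.0, 20.0), (77.0, 100.0), (105.0, 60.0)],
}

# Query with the same peak pattern as compound_A at half the intensity.
QUERY = [(43.0, 50.0), (58.0, 15.0), (71.0, 5.0)]

best_name, best_score = search_library(QUERY, LIBRARY)[0]
```

Because a search like this can only return compounds already present in the library, any molecule without a reference spectrum is unreachable, which is the coverage limitation that spectrum prediction is meant to address.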