Mass spectrometry is one of the key steps in protein identification, as it provides information about protein structure. Removing isotope peaks from the mass spectrum, a process called deisotoping, is vital. Several deisotoping algorithms exist, but each has its limitations and is dedicated to a particular mass spectrometry method. Data from experiments performed with the MALDI-ToF technique are characterized by high dimensionality. This paper presents a method for identifying isotopic envelopes in MALDI-ToF molecular imaging data based on a Mamdani-Assilian fuzzy system and spatial maps of the molecular distribution of the peaks included in the isotopic envelope. Several image texture measures were used to evaluate the spatial molecular distribution maps. The algorithm was tested on eight datasets obtained from MALDI-ToF experiments on samples from the National Institute of Oncology in Gliwice, taken from patients with cancer of the head and neck region. The data were subjected to pre-processing and feature extraction. The results were compared with those of three existing deisotoping algorithms. The analysis showed that the proposed method for identifying isotopic envelopes enables the detection of overlapping envelopes through an approach oriented toward the study of peak pairs. Moreover, the proposed algorithm enables the analysis of large datasets.
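To make the peak-pair idea concrete, the following is a minimal sketch and not the method evaluated in this paper. It assumes that two peaks are candidate members of the same isotopic envelope when their m/z spacing is close to the expected isotope spacing for a given charge and their spatial intensity maps are similar. The names `candidate_pairs` and `pearson`, the spacing constant, the tolerance, and the correlation threshold are all illustrative assumptions, and a plain Pearson correlation stands in for the texture measures and the fuzzy inference step.

```python
from math import sqrt

# Assumed spacing between neighbouring isotope peaks at charge 1, in Da
# (approximately the 13C-12C mass difference). Illustrative value only.
ISOTOPE_SPACING = 1.00335


def pearson(a, b):
    """Pearson correlation of two equally long, flattened intensity maps."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant map carries no spatial information
    return cov / sqrt(var_a * var_b)


def candidate_pairs(mz, maps, charge=1, tol=0.01, min_corr=0.8):
    """Return (i, j, corr) for peak pairs that may share an isotopic envelope.

    mz   : peak positions sorted in ascending order
    maps : one flattened spatial intensity map per peak (hypothetical layout)
    """
    expected = ISOTOPE_SPACING / charge
    pairs = []
    for i in range(len(mz)):
        for j in range(i + 1, len(mz)):
            # Keep only pairs whose spacing matches the isotope spacing.
            if abs((mz[j] - mz[i]) - expected) > tol:
                continue
            # Keep only pairs whose spatial distributions agree.
            corr = pearson(maps[i], maps[j])
            if corr >= min_corr:
                pairs.append((i, j, corr))
    return pairs
```

Because every pair is scored independently, a peak can appear in more than one pair, which is what allows overlapping envelopes to be represented. A real implementation would replace the fixed thresholds with the fuzzy rules and texture features described above.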