Because cancer is highly heterogeneous, multi-omics data and clinical characteristics differ significantly among cancer subtypes. Accurate classification of cancer subtypes can therefore help doctors choose the most appropriate treatment, improve treatment outcomes, and predict patient survival more accurately. In this study, we propose a supervised multi-head attention model (SMA) for cancer subtype classification. First, the attention mechanism and feature sharing module of SMA learn both the global and the local feature information of multi-omics data. Second, the fusion module deeply fuses the multi-head attention encoders of a Siamese network, which enriches the parameters of the model. In extensive experiments on simulated, single-cell, and cancer multi-omics datasets, SMA achieves the highest accuracy, macro-F1, and weighted F1 in cancer subtype classification compared with AE-, CNN-, and GNN-based models. Our attention-based approach thus provides a basis for future research on multi-omics data.
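The three reported metrics weight classes differently, which matters when cancer subtypes are imbalanced. The following is a minimal sketch of how accuracy, macro-F1, and weighted F1 are conventionally computed. It is not the authors' evaluation code; the function name `classification_metrics` and the toy subtype labels are hypothetical and used only for illustration.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return (accuracy, macro_f1, weighted_f1) for two equal-length label lists."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true samples per class

    f1_per_class = {}
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        f1_per_class[c] = 2 * tp / denom if denom else 0.0

    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # Macro-F1: unweighted mean over classes, so rare subtypes count equally
    macro_f1 = sum(f1_per_class.values()) / len(labels)
    # Weighted F1: mean over classes weighted by each class's true sample count
    weighted_f1 = sum(f1_per_class[c] * support[c] for c in labels) / len(pairs)
    return accuracy, macro_f1, weighted_f1


# Toy example with two hypothetical subtypes, "A" (4 samples) and "B" (2 samples)
y_true = ["A", "A", "A", "A", "B", "B"]
y_pred = ["A", "A", "A", "B", "B", "A"]
accuracy, macro_f1, weighted_f1 = classification_metrics(y_true, y_pred)
```

In this toy case the per-class F1 is 0.75 for "A" and 0.5 for "B", so macro-F1 (0.625) is lower than weighted F1 (about 0.667) because the minority subtype is classified less well.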