Advances in Spatial Transcriptomics (ST) have enabled spatially aware profiling of gene expression based on histopathology images. Although ST data offer valuable insights into the tumor micro-environment, they remain expensive to acquire. Predicting ST expression directly from digital pathology images is therefore desirable. Current methods usually adopt existing regression backbones for this task, ignoring the inherent multi-scale hierarchical structure of digital pathology images. To address this limitation, we propose M2ORT, a many-to-one regression Transformer that accommodates the hierarchical structure of pathology images through a decoupled multi-scale feature extractor. Unlike traditional models trained on one-to-one image-label pairs, M2ORT takes multiple pathology images of different magnifications at a time and jointly predicts the gene expression at their common ST spot, thereby learning a many-to-one relationship through training. We tested M2ORT on three public ST datasets, and the experimental results show that it achieves state-of-the-art performance with fewer parameters and floating-point operations (FLOPs). The code is available at: https://github.com/Dootmaan/M2ORT/.
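To make the many-to-one pairing concrete, the following is a minimal sketch of how several views of one ST spot could be tied to a single expression label. It is not taken from the M2ORT code base: the names `ManyToOneSample`, `multi_scale_boxes`, and `make_sample` are hypothetical, and the patch size of 224 pixels and the downsample factors (1, 2, 4) are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# (left, top, right, bottom) in full-resolution pixel coordinates
Box = Tuple[int, int, int, int]


@dataclass
class ManyToOneSample:
    """Several views of one ST spot paired with a single expression vector."""
    boxes: List[Box]          # one crop box per magnification level
    expression: List[float]   # gene expression label shared by all views


def multi_scale_boxes(cx: int, cy: int, patch: int = 224,
                      downsamples: Sequence[int] = (1, 2, 4)) -> List[Box]:
    """Return concentric crop boxes centred on the spot at (cx, cy).

    A box with downsample factor d spans patch * d full-resolution pixels,
    so resizing it by 1/d yields a patch x patch view at lower magnification.
    The patch size and factors are illustrative assumptions.
    """
    boxes = []
    for d in downsamples:
        half = patch * d // 2
        boxes.append((cx - half, cy - half, cx + half, cy + half))
    return boxes


def make_sample(cx: int, cy: int, expression: Sequence[float]) -> ManyToOneSample:
    """Pair all magnification views of one spot with its single label."""
    return ManyToOneSample(multi_scale_boxes(cx, cy), list(expression))


# Toy usage: one spot at (1000, 1000) with a made-up three-gene label.
sample = make_sample(1000, 1000, [0.5, 1.2, 0.0])
for box in sample.boxes:
    print(box)
```

Each sample here carries three input regions but only one target vector, which is the many-to-one structure the abstract describes; a one-to-one pipeline would instead keep a single box per label.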