Advances in Spatial Transcriptomics (ST) have enabled spatially aware profiling of gene expression based on histopathology images. Although ST data offers valuable insight into the tumor micro-environment, it remains expensive to acquire. Predicting ST expression directly from digital pathology images is therefore desirable. Current methods usually adopt existing regression backbones for this task, ignoring the inherent multi-scale hierarchical structure of digital pathology images. To address this limitation, we propose M2ORT, a many-to-one regression Transformer that accommodates the hierarchical structure of pathology images through a decoupled multi-scale feature extractor. Unlike traditional models, which are trained on one-to-one image-label pairs, M2ORT takes multiple pathology images of different magnifications at a time and jointly predicts the gene expression at their common ST spot, with the aim of learning a many-to-one relationship through training. We evaluate M2ORT on three public ST datasets, and the experimental results show that it achieves state-of-the-art performance with fewer parameters and floating-point operations (FLOPs). The code is available at: https://github.com/Dootmaan/M2ORT/.
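To make the many-to-one pairing concrete, the following is a minimal NumPy sketch of how several views of the same ST spot at different magnifications could be tied to a single expression label. It is not the M2ORT implementation: the helper names (`crop_centered`, `average_pool`, `many_to_one_sample`), the 16-pixel base patch size, the downsampling factors (1, 2, 4), the use of block averaging to emulate lower magnification, and the 250-gene label vector are all assumptions chosen for illustration.

```python
import numpy as np


def crop_centered(image, cy, cx, size):
    """Return a size x size crop centered on (cy, cx), zero-padded at the borders."""
    half = size // 2
    padded = np.pad(image, ((half, half), (half, half), (0, 0)), mode="constant")
    # After padding by `half`, original pixel (cy, cx) sits at (cy + half, cx + half),
    # so a crop starting at (cy, cx) in padded coordinates is centered on it.
    return padded[cy:cy + size, cx:cx + size]


def average_pool(patch, factor):
    """Downsample by an integer factor using non-overlapping block averages."""
    h, w, c = patch.shape
    return patch.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))


def many_to_one_sample(image, spot_yx, expression, base=16, factors=(1, 2, 4)):
    """Build one training sample: several views of one spot, one shared label.

    Each view covers a larger field of view (base * factor pixels) and is pooled
    back to base x base, which stands in for a lower magnification (assumption).
    """
    cy, cx = spot_yx
    views = [average_pool(crop_centered(image, cy, cx, base * f), f) for f in factors]
    return views, expression


# Hypothetical data: a random "slide" and a random 250-gene expression vector.
rng = np.random.default_rng(0)
slide = rng.random((128, 128, 3))
genes = rng.random(250)

views, label = many_to_one_sample(slide, (64, 64), genes)
print([v.shape for v in views], label.shape)
```

All three views share the same spatial size, so a model can process them with separate (decoupled) branches and regress a single expression vector from their combined features; how those branches are built and fused is left to the actual model.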