Tumour heterogeneity in breast cancer makes it difficult to predict outcome and response to therapy. Spatial transcriptomics technologies may address this challenge because they provide a wealth of information about gene expression at the cell level, but their cost hinders their use in large-scale clinical oncology studies. Predicting gene expression from hematoxylin and eosin (H&E) stained histology images offers a more affordable alternative for such studies. Here we present BrST-Net, a deep learning framework for predicting gene expression from histopathology images using spatial transcriptomics data. Using this framework, we trained and evaluated 10 state-of-the-art deep learning models, without pretrained weights, for the prediction of 250 genes. To improve the generalisation performance of the main network, we introduce an auxiliary network into the framework. Our method predicts 237 genes with positive correlation, including 24 genes with a median correlation coefficient greater than 0.50. This is a notable improvement over previous studies, which could predict only 102 genes with positive correlation, with the highest correlation values ranging from 0.29 to 0.34.
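The headline figures above (genes with positive correlation, genes with a median correlation above 0.50) can be illustrated with a minimal sketch. It assumes that the correlation is Pearson's r between measured and predicted expression, computed per gene within each evaluation group, and that the median is taken across groups; the abstract does not specify either choice. The function names `per_gene_pearson` and `summarise` and the grouping itself are hypothetical and are not taken from BrST-Net.

```python
import numpy as np


def per_gene_pearson(y_true, y_pred):
    """Pearson r for each gene (column) between measured and predicted expression.

    Both arrays have shape (n_spots, n_genes).
    """
    yt = y_true - y_true.mean(axis=0)
    yp = y_pred - y_pred.mean(axis=0)
    denom = np.sqrt((yt ** 2).sum(axis=0) * (yp ** 2).sum(axis=0))
    # A gene with zero variance yields NaN instead of a misleading 0.
    with np.errstate(invalid="ignore", divide="ignore"):
        return (yt * yp).sum(axis=0) / denom


def summarise(groups, threshold=0.50):
    """Summarise per-gene correlation across evaluation groups.

    groups: list of (y_true, y_pred) pairs, one per group.
    The grouping (for example, one pair per patient) is an assumption.
    """
    # Shape (n_groups, n_genes): one correlation per gene per group.
    r = np.stack([per_gene_pearson(t, p) for t, p in groups])
    median_r = np.nanmedian(r, axis=0)
    return {
        "median_r": median_r,
        "n_positive": int((median_r > 0).sum()),
        "n_above_threshold": int((median_r > threshold).sum()),
    }
```

Calling `summarise` on a list of per-group arrays returns the per-gene median correlation together with the two counts. Taking the median across groups, rather than pooling all spots, keeps one atypical group from dominating the per-gene score, though whether the original evaluation does this is not stated.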