Cancer prognosis on gigapixel Whole-Slide Images (WSIs) has always been a challenging task. To further enhance WSI visual representations, existing methods have explored the image pyramids in WSIs instead of single-resolution images. Even so, they still face two major problems: high computational cost and an overlooked semantic gap in multi-resolution feature fusion. To tackle these problems, this paper proposes to exploit WSI pyramids efficiently from a new perspective: a dual-stream network with cross-attention (DSCA). Our key idea is to use two sub-streams to process WSI patches at two resolutions. A square pooling is devised in the high-resolution stream to significantly reduce computational cost, and a cross-attention-based method is proposed to properly handle the fusion of dual-stream features. We validate DSCA on three publicly available datasets with a total of 3,101 WSIs from 1,911 patients. Our experiments and ablation studies verify that (i) the proposed DSCA outperforms existing state-of-the-art methods in cancer prognosis, with an average C-Index improvement of around 4.6%; (ii) DSCA is more computationally efficient: it has more learnable parameters (6.31M vs. 860.18K) but a lower computational cost (2.51G vs. 4.94G) than a typical existing multi-resolution network; and (iii) the key components of DSCA, the dual-stream design and cross-attention, do contribute to the model's performance, yielding an average C-Index rise of around 2.0% while maintaining a relatively small computational load. DSCA could serve as an alternative and effective tool for WSI-based cancer prognosis.
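The two components named in the abstract, square pooling in the high-resolution stream and cross-attention for fusing the two streams, can be illustrated with a minimal NumPy sketch. The abstract does not specify either operation, so everything below is an assumption made for illustration: `square_pool` averages each contiguous group of `ratio * ratio` high-resolution patch features (the paper's pooling may be learned or organized differently), `cross_attention` is plain single-head scaled dot-product attention with the low-resolution features as queries, and the function names, shapes, and random weights are hypothetical.

```python
import numpy as np


def square_pool(high_feats, ratio=4):
    """Average each group of ratio*ratio high-resolution patch features.

    Assumption of this sketch: high_feats has shape (n_low * ratio**2, dim),
    and the ratio**2 high-resolution patches covering one low-resolution
    patch are stored contiguously.
    """
    group = ratio * ratio
    n, dim = high_feats.shape
    if n % group != 0:
        raise ValueError("patch count must be a multiple of ratio**2")
    return high_feats.reshape(n // group, group, dim).mean(axis=1)


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, keys_values, w_q, w_k, w_v):
    """Single-head scaled dot-product attention across two streams.

    Queries come from one stream, keys and values from the other.
    Returns the fused features and the attention weights.
    """
    q = queries @ w_q
    k = keys_values @ w_k
    v = keys_values @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


# Hypothetical toy sizes: 6 low-resolution patches, each covered by
# 4 x 4 = 16 high-resolution patches, with 8-dimensional features.
rng = np.random.default_rng(0)
n_low, ratio, dim = 6, 4, 8
low = rng.normal(size=(n_low, dim))
high = rng.normal(size=(n_low * ratio**2, dim))
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) / np.sqrt(dim) for _ in range(3))

pooled = square_pool(high, ratio)  # 96 high-res tokens reduced to 6
fused, attn = cross_attention(low, pooled, w_q, w_k, w_v)
print(pooled.shape, fused.shape)  # (6, 8) (6, 8)
```

In this toy setting, pooling shrinks the high-resolution sequence by a factor of `ratio**2` before attention is applied, which is one plausible way such a step could reduce computational cost; it is not a claim about how the reported savings are obtained.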