Recently, pathological diagnosis on whole slide images (WSIs) has achieved superior performance by combining deep learning models with the multiple instance learning (MIL) framework. However, the gigapixel scale of WSIs poses a great challenge for efficient MIL. Existing studies either do not consider global dependencies among instances, or use approximations such as linear attention to model pairwise instance interactions, which inevitably creates performance bottlenecks. To tackle this challenge, we propose MamMIL, a framework for WSI analysis that integrates the selective structured state space model (i.e., Mamba) with MIL, enabling the modeling of global instance dependencies while maintaining linear complexity. Specifically, considering the irregularity of the tissue regions in WSIs, we represent each WSI as an undirected graph. Because Mamba can only process 1D sequences, we further propose a topology-aware scanning mechanism that serializes the WSI graphs while preserving the topological relationships among the instances. Finally, to further perceive the topological structures among the instances and incorporate short-range feature interactions, we propose an instance aggregation block based on graph neural networks. Experiments show that MamMIL outperforms state-of-the-art frameworks. The code can be accessed at https://github.com/Vison307/MamMIL.
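To make the pipeline described above more concrete, the following is a minimal sketch of its three ingredients: a WSI as an undirected graph of patches, a serialization of that graph into a 1D sequence, and a neighbourhood aggregation step. The abstract does not specify any of these details, so everything here is an assumption made for illustration: the 8-neighbour grid adjacency, the breadth-first traversal (a stand-in, not the paper's topology-aware scanning mechanism), the mean aggregation (a stand-in for a graph neural network layer), and all function names, which are hypothetical.

```python
from collections import deque


def build_patch_graph(coords):
    """Undirected graph: patches are nodes, edges join 8-neighbours on the patch grid.

    Assumption: adjacency is defined by grid position; the paper may differ.
    """
    index = {c: i for i, c in enumerate(coords)}
    adj = {i: [] for i in range(len(coords))}
    for (r, c), i in index.items():
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                j = index.get((r + dr, c + dc))
                if j is not None and j != i:
                    adj[i].append(j)
    return adj


def serialize_graph(adj):
    """Breadth-first node order covering every connected component.

    Assumption: BFS is used only as a simple traversal that tends to keep
    nearby patches close in the sequence; it is not the paper's scan.
    """
    order, seen = [], set()
    for start in sorted(adj):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nb in sorted(adj[node]):
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
    return order


def aggregate_neighbours(features, adj):
    """Mean over each node and its neighbours (stand-in for one GNN layer)."""
    out = []
    for i, feat in enumerate(features):
        group = [feat] + [features[j] for j in adj[i]]
        out.append([sum(v) / len(group) for v in zip(*group)])
    return out


# Hypothetical patch grid positions: three touching patches and one isolated patch.
coords = [(0, 0), (0, 1), (1, 1), (5, 5)]
adj = build_patch_graph(coords)
order = serialize_graph(adj)
```

In a full model, the sequence given by `order` would index the patch features fed to a sequence model such as Mamba. The isolated patch at `(5, 5)` illustrates why the traversal restarts for each connected component, since tissue regions in a WSI are irregular and may be disconnected.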