Whole Slide Image (WSI) classification with multiple instance learning (MIL) in digital pathology faces significant computational challenges. Current methods mostly rely on extensive self-supervised learning (SSL) to reach satisfactory performance, which demands long training periods and considerable computational resources. Conversely, forgoing pre-training altogether degrades performance because of the domain shift from natural images to WSIs. We introduce the \textbf{\textit{Snuffy}} architecture, a novel MIL-pooling method based on sparse transformers that mitigates the performance loss incurred by limited pre-training and makes continual few-shot pre-training a competitive option. Our sparsity pattern is tailored to pathology and is theoretically proven to be a universal approximator with, to date, the tightest probabilistic sharp bound on the number of layers for sparse transformers. We demonstrate Snuffy's effectiveness on the CAMELYON16 and TCGA Lung cancer datasets, achieving superior WSI- and patch-level accuracies. The code is available at \url{https://github.com/jafarinia/snuffy}.