Whole Slide Images (WSIs) pose a challenging computer vision problem because of their gigapixel size and the numerous artefacts they contain. Yet they are a valuable resource for patient diagnosis and stratification, and often represent the gold standard for diagnostic tasks. Real-world clinical datasets tend to come as sets of heterogeneous WSIs labelled only at the patient level, with few or no annotations. Weakly supervised attention-based multiple instance learning approaches have been developed in recent years to address these challenges, but they can fail to resolve both long- and short-range dependencies. Here we propose an end-to-end multi-stain self-attention graph (MUSTANG) multiple instance learning pipeline designed to solve a weakly supervised gigapixel multi-image classification task in which the label is assigned at the patient level and no slide-level labels or region annotations are available. The pipeline applies self-attention but restricts its operations to a highly sparse k-Nearest Neighbour graph of embedded WSI patches, built using Euclidean distance. We show that this approach achieves a state-of-the-art F1-score/AUC of 0.89/0.92, outperforming the widely used CLAM model. Our approach is highly modular and can easily be adapted to different clinical datasets: it requires only a patient-level label, without annotations, and accepts WSI sets of different sizes because the graphs can vary in size and structure. The source code can be found at https://github.com/AmayaGS/MUSTANG.
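The central idea in the abstract, self-attention restricted to a sparse k-Nearest Neighbour graph of patch embeddings, can be illustrated with a minimal NumPy sketch. This is not the MUSTANG implementation (the linked repository is authoritative). The function names `knn_edges` and `sparse_self_attention` are hypothetical, the attention has no learned projections, and the mean-pooling readout at the end is an assumption made only to show how a single patient-level vector could be obtained from a graph of any size.

```python
import numpy as np


def knn_edges(embeddings, k):
    """Return an (N, k) array: the k nearest neighbours of each patch (Euclidean)."""
    sq = np.sum(embeddings ** 2, axis=1)
    # Squared pairwise distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    d2 = sq[:, None] + sq[None, :] - 2.0 * embeddings @ embeddings.T
    np.fill_diagonal(d2, np.inf)  # a patch is not its own neighbour
    return np.argsort(d2, axis=1)[:, :k]


def sparse_self_attention(embeddings, neighbours):
    """Scaled dot-product attention where each node sees only itself and its kNN.

    Assumption: queries, keys and values are the raw embeddings (no learned
    weights), which keeps the sketch short but is not how a trained model works.
    """
    n, d = embeddings.shape
    mask = np.zeros((n, n), dtype=bool)
    mask[np.arange(n)[:, None], neighbours] = True  # allow kNN edges
    np.fill_diagonal(mask, True)                    # allow self-loops
    scores = embeddings @ embeddings.T / np.sqrt(d)
    scores = np.where(mask, scores, -np.inf)        # block non-edges
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ embeddings, weights


# Toy input: 20 patch embeddings of dimension 8 pooled from one patient's slides.
rng = np.random.default_rng(0)
patches = rng.normal(size=(20, 8))
neighbours = knn_edges(patches, k=3)
updated, attn = sparse_self_attention(patches, neighbours)
patient_vector = updated.mean(axis=0)  # assumed readout: mean over all nodes
```

Because the mask admits at most `k + 1` entries per row, the attention cost grows with the number of edges rather than with the square of the number of patches once a sparse representation is used. The dense mask here is only for readability.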