Weakly-supervised classification of histopathology slides is a computationally intensive task, with a typical whole slide image (WSI) containing billions of pixels to process. We propose Discriminative Region Active Sampling for Multiple Instance Learning (DRAS-MIL), a computationally efficient slide classification method using attention scores to focus sampling on highly discriminative regions. We apply this to the diagnosis of ovarian cancer histological subtypes, which is an essential part of the patient care pathway as different subtypes have different genetic and molecular profiles, treatment options, and patient outcomes. We use a dataset of 714 WSIs acquired from 147 epithelial ovarian cancer patients at Leeds Teaching Hospitals NHS Trust to distinguish the most common subtype, high-grade serous carcinoma, from the other four subtypes (low-grade serous, endometrioid, clear cell, and mucinous carcinomas) combined. We demonstrate that DRAS-MIL can achieve similar classification performance to exhaustive slide analysis, with a 3-fold cross-validated AUC of 0.8679 compared to 0.8781 with standard attention-based MIL classification. Our approach uses at most 18% as much memory as the standard approach, while taking 33% of the time when evaluating on a GPU and only 14% on a CPU alone. Reducing prediction time and memory requirements may benefit clinical deployment and the democratisation of AI, reducing the extent to which computational hardware limits end-user adoption.
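The idea of using attention scores to focus sampling can be illustrated with a minimal sketch. This is not the authors' DRAS-MIL implementation: the abstract does not specify the sampling rule, so the loop below is an assumed, simplified variant that starts from random patches and then repeatedly samples the unsampled patches spatially closest to the highest-attention patches seen so far. The function names (`attention_pool`, `active_sample`), the parameters (`n_init`, `n_rounds`, `n_per_round`, `top_k`), and the `embed_fn` callback are all hypothetical, and the attention pooling is the plain tanh form commonly used in attention-based MIL.

```python
import numpy as np


def attention_pool(features, V, w):
    """Attention-based MIL pooling (plain tanh variant).

    features: (n, d) patch embeddings; V: (d, h); w: (h,).
    Returns the attention-weighted slide embedding and the weights.
    """
    scores = np.tanh(features @ V) @ w            # one score per patch
    scores = scores - scores.max()                # stabilise the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ features, weights


def active_sample(coords, embed_fn, V, w, n_init=8, n_rounds=3,
                  n_per_round=8, top_k=4, rng=None):
    """Hypothetical attention-guided sampling loop (an assumption, not
    the published algorithm).

    coords:   (n, 2) patch positions on the slide.
    embed_fn: maps an array of patch indices to (m, d) embeddings, so
              only sampled patches are ever embedded.
    """
    rng = np.random.default_rng(rng)
    n = len(coords)
    # Start from a uniform random sample of patches.
    sampled = rng.choice(n, size=min(n_init, n), replace=False).tolist()
    for _ in range(n_rounds):
        idx = np.array(sampled)
        _, weights = attention_pool(embed_fn(idx), V, w)
        # Patches with the highest attention among those sampled so far.
        top = idx[np.argsort(weights)[-top_k:]]
        remaining = np.setdiff1d(np.arange(n), idx)
        if remaining.size == 0:
            break
        # Distance from each unsampled patch to its nearest top patch.
        diffs = coords[remaining][:, None, :] - coords[top][None, :, :]
        dist = np.linalg.norm(diffs, axis=-1).min(axis=1)
        # Sample the unsampled patches closest to high-attention regions.
        sampled.extend(remaining[np.argsort(dist)[:n_per_round]].tolist())
    idx = np.array(sampled)
    slide_embedding, weights = attention_pool(embed_fn(idx), V, w)
    return slide_embedding, idx, weights
```

In this sketch, only the sampled patches are passed to `embed_fn`, which is where the time and memory savings over exhaustive analysis would come from; a classifier head applied to `slide_embedding` would then produce the slide-level prediction.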