Weakly-supervised classification of histopathology slides is a computationally intensive task, with a typical whole slide image (WSI) containing billions of pixels to process. We propose Discriminative Region Active Sampling for Multiple Instance Learning (DRAS-MIL), a computationally efficient slide classification method using attention scores to focus sampling on highly discriminative regions. We apply this to the diagnosis of ovarian cancer histological subtypes, which is an essential part of the patient care pathway as different subtypes have different genetic and molecular profiles, treatment options, and patient outcomes. We use a dataset of 714 WSIs acquired from 147 epithelial ovarian cancer patients at Leeds Teaching Hospitals NHS Trust to distinguish the most common subtype, high-grade serous carcinoma, from the other four subtypes (low-grade serous, endometrioid, clear cell, and mucinous carcinomas) combined. We demonstrate that DRAS-MIL can achieve similar classification performance to exhaustive slide analysis, with a 3-fold cross-validated AUC of 0.8679 compared to 0.8781 with standard attention-based MIL classification. Our approach uses at most 18% as much memory as the standard approach, while taking 33% of the time when evaluating on a GPU and only 14% on a CPU alone. Reducing prediction time and memory requirements may benefit clinical deployment and the democratisation of AI, reducing the extent to which computational hardware limits end-user adoption.
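The abstract states only that attention scores are used to focus sampling on discriminative regions; it does not specify the procedure. As a rough illustration of that general idea, and not the authors' DRAS-MIL algorithm, the sketch below scores a small random set of patches first and then spends the remaining budget on patches near those that scored highly. Everything named here is hypothetical: `active_sample`, `toy_attention`, the grid layout, and all parameters are assumptions, and `toy_attention` merely stands in for the attention score a trained attention-based MIL model would assign to a patch.

```python
import math
import random


def toy_attention(xy, centre=(30, 30), width=5.0):
    """Hypothetical stand-in for a model's attention score (non-negative).

    Scores peak at `centre`, imitating one highly discriminative region.
    """
    d2 = (xy[0] - centre[0]) ** 2 + (xy[1] - centre[1]) ** 2
    return math.exp(-d2 / (2.0 * width ** 2))


def active_sample(grid_shape, score_fn, n_initial=20, n_rounds=5,
                  n_per_round=10, radius=2, seed=0):
    """Attention-guided patch sampling on a (rows, cols) patch grid.

    Returns a dict mapping each scored patch coordinate to its score.
    At most n_initial + n_rounds * n_per_round patches are ever scored.
    """
    rng = random.Random(seed)
    rows, cols = grid_shape
    all_coords = [(r, c) for r in range(rows) for c in range(cols)]

    # Stage 1: score a small uniform random sample of patches.
    initial = rng.sample(all_coords, min(n_initial, len(all_coords)))
    scores = {xy: score_fn(xy) for xy in initial}

    # Stage 2: repeatedly sample near patches with high attention.
    for _ in range(n_rounds):
        coords = list(scores)
        weights = [scores[xy] for xy in coords]
        if sum(weights) <= 0:
            # Fall back to uniform choice if every score is zero.
            weights = [1.0] * len(coords)
        # Pick source patches with probability proportional to attention.
        sources = rng.choices(coords, weights=weights, k=n_per_round)
        for r, c in sources:
            # Propose a neighbour within `radius`, clamped to the grid.
            nr = min(max(r + rng.randint(-radius, radius), 0), rows - 1)
            nc = min(max(c + rng.randint(-radius, radius), 0), cols - 1)
            if (nr, nc) not in scores:
                scores[(nr, nc)] = score_fn((nr, nc))
    return scores


sampled = active_sample((50, 50), toy_attention)
print(len(sampled), "of", 50 * 50, "patches scored")
```

The time and memory savings reported in the abstract would come from scoring only a fraction of the patches in a slide; in this toy setup the budget caps scoring at 70 of 2,500 patches. How the real method chooses its sampling distribution, budget, and stopping rule is not given in the abstract.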