Towards More Transparent and Accurate Cancer Diagnosis with an Unsupervised CAE Approach

Digital pathology has revolutionized cancer diagnosis by leveraging Content-Based Medical Image Retrieval (CBMIR) to analyze histopathological Whole Slide Images (WSIs). CBMIR lets a pathologist search for previously seen images with similar content, which improves diagnostic reliability and accuracy. In 2020, breast and prostate cancer constituted 11.7% and 14.1% of cancer cases, respectively, as reported by the Global Cancer Observatory (GCO). The proposed Unsupervised CBMIR (UCBMIR) replicates the traditional cancer diagnosis workflow and offers a dependable way to support pathologists in reaching WSI-based diagnostic conclusions. It can reduce pathologists' workload and may thereby improve diagnostic efficiency. To address the scarcity of labeled histopathological images in CBMIR, a customized unsupervised Convolutional Autoencoder (CAE) was developed that extracts 200 features per image for the search engine component. UCBMIR was evaluated with numerical techniques widely used in CBMIR, together with a visual evaluation and a comparison against a classifier. Validation involved three distinct datasets, one of which served as an external evaluation that demonstrated the method's effectiveness. UCBMIR outperformed previous studies, achieving top-5 recall of 99% on BreaKHis and 80% on SICAPv2 under the first evaluation technique, and precision of 91% on BreaKHis and 70% on SICAPv2 under the second evaluation technique. Furthermore, UCBMIR was able to identify various patterns in patches, achieving 81% top-5 accuracy when tested on external images from the Arvaniti dataset.
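
As a rough illustration of the retrieval and evaluation steps summarized above, the following sketch ranks database images by distance to a query in feature space and computes two top-k scores. It is a minimal sketch under stated assumptions, not the authors' implementation: the Euclidean distance, the function names, the synthetic 200-dimensional features standing in for CAE outputs, and the two metric definitions (a query counts as a hit if any of its top-k results shares its label; precision is the mean fraction of top-k results sharing the label) are all assumptions made for illustration.

```python
import numpy as np


def top_k_indices(query, database, k=5):
    """Return indices of the k database rows closest to `query`.

    Euclidean distance is an assumption; the abstract does not name the metric.
    """
    dists = np.linalg.norm(database - query, axis=1)
    return np.argsort(dists)[:k]


def hit_rate_at_k(queries, query_labels, database, db_labels, k=5):
    """Fraction of queries with at least one same-label item in the top k."""
    hits = [
        np.any(db_labels[top_k_indices(q, database, k)] == y)
        for q, y in zip(queries, query_labels)
    ]
    return float(np.mean(hits))


def precision_at_k(queries, query_labels, database, db_labels, k=5):
    """Mean fraction of the top-k items that share the query's label."""
    fractions = [
        np.mean(db_labels[top_k_indices(q, database, k)] == y)
        for q, y in zip(queries, query_labels)
    ]
    return float(np.mean(fractions))


# Synthetic stand-in for 200-dimensional CAE features (hypothetical data):
# two well-separated classes with 50 database items each.
rng = np.random.default_rng(0)
db_labels = np.repeat([0, 1], 50)
database = rng.normal(size=(100, 200)) + 3.0 * db_labels[:, None]
query_labels = np.array([0, 1])
queries = rng.normal(size=(2, 200)) + 3.0 * query_labels[:, None]

hit_rate = hit_rate_at_k(queries, query_labels, database, db_labels, k=5)
precision = precision_at_k(queries, query_labels, database, db_labels, k=5)
```

In practice the rows of `database` would be the features extracted by the trained encoder from labeled reference patches, and each query would be the feature vector of a new patch; the labels are needed only to score the retrieval, not to train the unsupervised encoder.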
