Multi-modal learning has attracted widespread attention in medical image analysis. Combining multi-modal data, namely whole slide images (WSIs) and clinical information, can improve the performance of deep learning models in diagnosing axillary lymph node metastasis. However, clinical information is often difficult to collect in clinical practice because of privacy concerns, limited resources, lack of interoperability, and similar obstacles. Although patient selection can ensure that the training set contains complete multi-modal data for model development, clinical information may still be missing at test time. This typically degrades performance, which limits the use of multi-modal models in the clinic. To alleviate this problem, we propose a bidirectional distillation framework consisting of a multi-modal branch and a single-modal branch. The single-modal branch acquires complete multi-modal knowledge from the multi-modal branch, while the multi-modal branch learns robust WSI features from the single-modal branch. We conduct experiments on a public dataset of Lymph Node Metastasis in Early Breast Cancer to validate the method. Our approach not only achieves state-of-the-art performance with an AUC of 0.861 on the test set without missing data, but also yields an AUC of 0.842 when the missing-modality rate is 80%. These results show the effectiveness of the approach in handling both multi-modal data and missing modalities. Such a model has the potential to improve treatment decision-making for early breast cancer patients based on their axillary lymph node metastatic status.
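The two-way knowledge transfer between the branches can be illustrated with a minimal sketch. The abstract does not specify the loss functions, so everything below is an assumption made for illustration: a temperature-softened KL term stands in for the multi-modal to single-modal transfer, a mean-squared-error term on WSI features stands in for the reverse direction, and the names `bidirectional_distillation_loss`, `drop_clinical`, `temperature`, `alpha`, and `beta` are hypothetical rather than taken from the paper.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def mse(a, b):
    """Mean squared error between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def bidirectional_distillation_loss(multi_logits, single_logits,
                                    multi_wsi_feat, single_wsi_feat,
                                    temperature=2.0, alpha=1.0, beta=1.0):
    """Hypothetical combined loss; the paper's actual formulation may differ.

    In a real training loop, each term would typically treat its source
    branch as fixed (no gradient); this sketch only computes the values.
    """
    # Multi-modal -> single-modal: the single-modal branch is pushed to
    # match the softened predictions of the multi-modal branch.
    p_multi = softmax(multi_logits, temperature)
    p_single = softmax(single_logits, temperature)
    to_single = kl_divergence(p_multi, p_single)

    # Single-modal -> multi-modal: the multi-modal branch's WSI features
    # are pulled toward the WSI features of the single-modal branch.
    to_multi = mse(multi_wsi_feat, single_wsi_feat)

    return alpha * to_single + beta * to_multi


def drop_clinical(clinical, missing_rate, rng):
    """Simulate a missing clinical modality: None with probability missing_rate."""
    return None if rng.random() < missing_rate else clinical


if __name__ == "__main__":
    rng = random.Random(0)
    loss = bidirectional_distillation_loss(
        multi_logits=[2.0, 0.0], single_logits=[0.0, 2.0],
        multi_wsi_feat=[0.5, 1.0, -0.5], single_wsi_feat=[0.4, 1.2, -0.3],
    )
    print(f"illustrative loss: {loss:.4f}")
    # Evaluate under an 80% missing-modality rate, as in the abstract.
    samples = [drop_clinical({"age": 50}, 0.8, rng) for _ in range(10)]
    print(sum(s is None for s in samples), "of 10 samples lack clinical data")
```

When `drop_clinical` returns `None`, a model of this kind would fall back to the single-modal (WSI-only) branch at inference, which is the scenario the reported 80% missing-rate evaluation targets.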