Computational pathology has substantially improved pathologists' workflows and diagnostic decision-making. Although computer-aided diagnostic systems have shown considerable value in whole slide image (WSI) analysis, multi-class classification under class imbalance remains a difficult challenge. To address this, we propose learning fine-grained information by generating sub-bags whose feature distributions are similar to those of the original WSIs. We also use a pseudo-bag generation algorithm to exploit the abundant, redundant information in WSIs, enabling efficient training on imbalanced multi-class tasks. In addition, we introduce an affinity-based sample selection and curriculum contrastive learning strategy to stabilize representation learning. Unlike previous approaches, our framework shifts from learning bag-level representations to understanding and exploiting the feature distribution of multi-instance bags. Our method yields significant performance improvements on three datasets covering tumor classification and lymph node metastasis. Averaged over the three tasks, it improves the F1 score by 4.39 points over the second-best method.
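The abstract does not spell out the pseudo-bag generation algorithm, so the following is only a minimal sketch of one common way such a step could look, not the authors' method. It assumes each WSI is already represented as an array of instance (patch) features, and it randomly partitions those instances into pseudo-bags that inherit the slide-level label. A uniform random split is used here because, in expectation, each part preserves the feature distribution of the original bag. The function name `make_pseudo_bags` and its parameters are hypothetical.

```python
import numpy as np


def make_pseudo_bags(features, label, num_pseudo_bags, seed=0):
    """Randomly partition one bag's instance features into pseudo-bags.

    Hypothetical helper: a uniform random split is an assumption made for
    illustration, not the algorithm described in the paper.
    """
    rng = np.random.default_rng(seed)
    # Shuffle instance indices so each pseudo-bag is a uniform random subset.
    order = rng.permutation(features.shape[0])
    # array_split tolerates bag sizes that are not divisible by num_pseudo_bags.
    chunks = np.array_split(order, num_pseudo_bags)
    # Every pseudo-bag inherits the slide-level label of its parent bag.
    return [(features[idx], label) for idx in chunks if idx.size > 0]


# Toy bag: 10 instances with 4-dimensional features, slide-level label 2.
bag = np.random.default_rng(1).normal(size=(10, 4))
pseudo_bags = make_pseudo_bags(bag, label=2, num_pseudo_bags=3)
```

Under these assumptions, a minority-class slide contributes several training bags instead of one, which is one plausible way pseudo-bags can ease training under class imbalance.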