In non-medical domains, foundation models (FMs) have revolutionized computer vision and language processing through large-scale self-supervised and multimodal learning. Consequently, their rapid adoption in computational pathology was expected to deliver comparable breakthroughs in cancer diagnosis, prognostication, and multimodal retrieval. However, recent systematic evaluations reveal fundamental weaknesses: low diagnostic accuracy, poor robustness, geometric instability, heavy computational demands, and concerning safety vulnerabilities. This short paper examines these shortcomings and argues that they stem from deeper conceptual mismatches between the assumptions underlying generic foundation modeling in mainstream AI and the intrinsic complexity of human tissue. Seven interrelated causes are identified: biological complexity, ineffective self-supervision, overgeneralization, excessive architectural complexity, lack of domain-specific innovation, insufficient data, and a fundamental design flaw related to tissue patch size. These findings suggest that current pathology foundation models remain conceptually misaligned with the nature of tissue morphology and call for a fundamental rethinking of the paradigm itself.