Inevitable domain and task discrepancies in real-world scenarios can impair the generalization performance of pre-trained deep models for medical data. Therefore, we audaciously propose building a general-purpose medical AI system that can be seamlessly adapted to downstream domains and tasks. Since domain/task adaptation usually requires additional labeling of the target data, a data-efficient adaptation algorithm is desirable to reduce the cost of transferring the learned knowledge. Our recent work found that vision-language models (VLMs) are efficient learners with extraordinary cross-domain ability. In this work, we therefore further explore the possibility of leveraging pre-trained VLMs as medical foundation models for building general-purpose medical AI. We thoroughly investigate three machine-learning paradigms for training the VLMs, namely domain/task-specialized learning, joint learning, and continual learning, and evaluate their generalization performance on cross-domain and cross-task test sets. To alleviate catastrophic forgetting during sequential training, we employ rehearsal learning and obtain a sharp boost in generalization capability. In a nutshell, our empirical evidence suggests that continual learning may be a practical and efficient learning paradigm for medical foundation models. We hope researchers can use our empirical evidence as a basis for further exploring the path toward medical foundation models.
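To make the rehearsal idea concrete, the following is a minimal sketch of how replayed examples might be mixed into sequential training. It is an illustration under stated assumptions, not the procedure used in this work: the abstract does not specify how the memory is filled, so reservoir sampling is assumed here, and the names `RehearsalBuffer`, `rehearsal_batches`, `capacity`, and `replay_size` are hypothetical.

```python
import random
from typing import Any, Iterator, List, Sequence, Tuple


class RehearsalBuffer:
    """Fixed-size memory of past examples, filled by reservoir sampling (an assumption)."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self.capacity = capacity
        self.memory: List[Any] = []
        self.seen = 0  # total number of examples offered so far
        self.rng = random.Random(seed)

    def add(self, example: Any) -> None:
        self.seen += 1
        if len(self.memory) < self.capacity:
            self.memory.append(example)
        else:
            # Replace a stored example with probability capacity / seen, so
            # every example offered so far is kept with equal probability.
            slot = self.rng.randrange(self.seen)
            if slot < self.capacity:
                self.memory[slot] = example

    def sample(self, k: int) -> List[Any]:
        # Draw up to k stored examples without replacement.
        return self.rng.sample(self.memory, min(k, len(self.memory)))


def rehearsal_batches(
    tasks: Sequence[Sequence[Any]],
    buffer: RehearsalBuffer,
    batch_size: int,
    replay_size: int,
) -> Iterator[Tuple[int, List[Any]]]:
    """Visit tasks one after another; each batch mixes new and replayed examples."""
    for task_id, data in enumerate(tasks):
        for start in range(0, len(data), batch_size):
            new = list(data[start:start + batch_size])
            replayed = buffer.sample(replay_size)
            yield task_id, new + replayed
            # Store the new examples only after the batch has been formed.
            for example in new:
                buffer.add(example)


# Two toy "tasks" standing in for two medical domains.
tasks = [[("A", i) for i in range(8)], [("B", i) for i in range(8)]]
buffer = RehearsalBuffer(capacity=4)
batches = list(rehearsal_batches(tasks, buffer, batch_size=4, replay_size=2))
```

In an actual training loop, each yielded batch would be passed to the model update step, so that examples from earlier domains or tasks keep contributing to the loss while a new one is being learned.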