The advent of foundation models (FMs) as an emerging suite of AI techniques has opened a wave of opportunities in computational healthcare. The interactive nature of these models, guided by pre-training data and human instructions, has ignited a data-centric AI paradigm that emphasizes better data characterization, quality, and scale. In healthcare AI, obtaining and processing high-quality clinical data records has been a longstanding challenge, with difficulties spanning data quantity, annotation, patient privacy, and ethics. In this survey, we investigate a wide range of data-centric approaches in the FM era (from model pre-training to inference) for improving the healthcare workflow. We discuss key perspectives on AI security, assessment, and alignment with human values. Finally, we offer a promising outlook on FM-based analytics for improving patient outcomes and clinical workflows in the evolving landscape of healthcare and medicine. We provide an up-to-date list of healthcare-related foundation models and datasets at https://github.com/Yunkun-Zhang/Data-Centric-FM-Healthcare .