A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics

The use of large language models (LLMs) in healthcare has generated both excitement and concern because of their ability to respond effectively to free-text queries with a degree of professional knowledge. This survey outlines the capabilities of the LLMs currently developed for healthcare and explains their development process, with the aim of providing an overview of the development roadmap from traditional Pretrained Language Models (PLMs) to LLMs. Specifically, we first explore the potential of LLMs to improve the efficiency and effectiveness of various healthcare applications, highlighting both strengths and limitations. Second, we compare earlier PLMs with the latest LLMs, and compare various LLMs with one another. We then summarize the related healthcare training data, training methods, optimization strategies, and usage. Finally, we investigate the unique concerns associated with deploying LLMs in healthcare settings, particularly regarding fairness, accountability, transparency, and ethics. Our survey provides a comprehensive investigation from the perspectives of both computer science and the healthcare specialty. Beyond the discussion of healthcare concerns, we support the computer science community by compiling a collection of open-source resources on GitHub, such as accessible datasets, the latest methodologies, code implementations, and evaluation benchmarks. In summary, we contend that a significant paradigm shift is underway from PLMs to LLMs. This shift encompasses a move from discriminative AI approaches to generative AI approaches, as well as a shift from model-centered methodologies to data-centered methodologies.
