Developing ChatGPT for Biology and Medicine: A Complete Review of Biomedical Question Answering

ChatGPT explores a strategic blueprint for question answering (QA) in delivering medical diagnosis, treatment recommendations, and other healthcare support. This is achieved through the increasing incorporation of medical domain data via natural language processing (NLP) and multimodal paradigms. By shifting the distribution of text, images, videos, and other modalities from the general domain to the medical domain, these techniques have accelerated progress in medical domain question answering (MDQA). They bridge the gap between human natural language and sophisticated medical domain knowledge or expert manual annotations, and they handle large-scale, diverse, unbalanced, or even unlabeled data analysis scenarios in medical contexts. Our central focus is the use of language models and multimodal paradigms for medical question answering, with the aim of guiding the research community in selecting appropriate mechanisms for their specific medical research requirements. We discuss in detail specialized unimodal tasks such as question answering, reading comprehension, reasoning, diagnosis, relation extraction, and probability modeling, as well as multimodal tasks such as visual question answering, image captioning, cross-modal retrieval, report summarization, and generation. Each section examines the specifics of the method under consideration. This paper highlights the structures and advances of medical domain methods relative to general domain methods, emphasizing their applications across different tasks and datasets. It also outlines current challenges and opportunities for future medical domain research, paving the way for continued innovation and application in this rapidly evolving field.
