MOSAIC: A Multilingual, Taxonomy-Agnostic, and Computationally Efficient Approach for Radiological Report Classification

Radiology reports contain rich clinical information that can be used to train imaging models without relying on costly manual annotation. However, existing approaches face critical limitations: rule-based methods struggle with linguistic variability, supervised models require large annotated datasets, and recent LLM-based systems depend on closed-source or resource-intensive models that are unsuitable for clinical use. Moreover, current solutions are largely restricted to English and single-modality, single-taxonomy datasets. We introduce MOSAIC, a multilingual, taxonomy-agnostic, and computationally efficient approach for radiological report classification. Built on a compact open-access language model (MedGemma-4B), MOSAIC supports both zero-/few-shot prompting and lightweight fine-tuning, enabling deployment on consumer-grade GPUs. We evaluate MOSAIC across seven datasets in English, Spanish, French, and Danish, spanning multiple imaging modalities and label taxonomies. The model achieves a mean macro F1 score of 88 across five chest X-ray datasets, approaching or exceeding expert-level performance, while requiring only 24 GB of GPU memory. With data augmentation, as few as 80 annotated samples are sufficient to reach a weighted F1 score of 82 on Danish reports, compared to 86 with the full 1600-sample training set. MOSAIC offers a practical alternative to large or proprietary LLMs in clinical settings. Code and models are open-source. We invite the community to evaluate and extend MOSAIC on new languages, taxonomies, and modalities.
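The abstract reports two different averages of the F1 score: macro F1 for the chest X-ray datasets and weighted F1 for the Danish reports. The following is a minimal sketch of how the two differ, assuming single-label classification and standard definitions. The label names and toy predictions are hypothetical and are not taken from the paper, and scores here lie in the range 0 to 1, whereas the abstract appears to report them on a 0 to 100 scale.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, label):
    """F1 for one class: 2*TP / (2*TP + FP + FN), or 0.0 if undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(per_class_f1(y_true, y_pred, l) for l in labels) / len(labels)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by each class's share of true labels."""
    support = Counter(y_true)
    total = len(y_true)
    return sum(
        (count / total) * per_class_f1(y_true, y_pred, label)
        for label, count in support.items()
    )


# Hypothetical, imbalanced toy data: 8 "normal" reports and 2 "effusion" reports.
y_true = ["normal"] * 8 + ["effusion"] * 2
# The classifier misses one of the two rare "effusion" cases.
y_pred = ["normal"] * 8 + ["normal", "effusion"]

print(f"macro F1:    {macro_f1(y_true, y_pred):.3f}")
print(f"weighted F1: {weighted_f1(y_true, y_pred):.3f}")
```

On imbalanced data such as this, the macro average penalizes errors on the rare class more heavily than the weighted average does, which is one reason the two figures in the abstract are not directly comparable.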

