We systematically investigate lightweight strategies for adapting large language models (LLMs) to radiology report summarization (RRS). Specifically, we focus on domain adaptation via pretraining (on natural language, biomedical text, or clinical text) and via discrete prompting or parameter-efficient fine-tuning. We consistently achieve the best performance by maximally adapting to the task: pretraining on clinical text and fine-tuning on RRS examples. Importantly, this method fine-tunes a mere 0.32% of parameters throughout the model, in contrast to end-to-end fine-tuning (100% of parameters). Additionally, we study the effect of in-context examples and out-of-distribution (OOD) training before concluding with a radiologist reader study and qualitative analysis. Our findings highlight the importance of domain adaptation in RRS and provide valuable insights toward developing effective natural language processing solutions for clinical tasks.