Instruction-tuned generative large language models (LLMs) such as ChatGPT and Bloomz possess excellent generalization abilities, but they struggle to understand radiology reports, particularly when generating the IMPRESSIONS section from the FINDINGS section. They tend to produce IMPRESSIONS that are either verbose or incomplete, mainly because of insufficient exposure to medical text during training. We present a system that leverages large-scale medical text for domain-adaptive pre-training of instruction-tuned LLMs, enhancing their medical knowledge and their performance on specific medical tasks. We show that, in a zero-shot setting, this system outperforms a number of pretrain-and-finetune adaptation methods on the IMPRESSIONS generation task, and that it ranks 1st among participating systems in Task 1B: Radiology Report Summarization at the BioNLP 2023 workshop.