We investigate the effectiveness of fine-tuning large language models (LLMs) on small medical datasets for text classification and named entity recognition tasks. Using a German cardiology report dataset and the i2b2 Smoking Challenge dataset, we demonstrate that fine-tuning small LLMs locally on limited training data improves performance on both tasks, achieving results comparable to those of larger models, with notable gains observed with as few as 200-300 training examples. Overall, the study highlights the potential of task-specific fine-tuning of LLMs for automating clinical workflows and efficiently extracting structured data from unstructured medical text.