Large Language Models (LLMs) have demonstrated their efficacy across a broad spectrum of tasks in healthcare applications. However, LLMs often need to be fine-tuned on task-specific, expert-annotated data to achieve optimal performance, which can be expensive and time-consuming. In this study, we fine-tune PaLM-2 with parameter-efficient fine-tuning (PEFT) using noisy labels obtained from gemini-pro 1.0 for the detection of Schedule-of-Event (SoE) tables, which specify the care plan in clinical trial protocols. We introduce a filtering mechanism to select high-confidence labels for this table classification task, thereby reducing the noise in the auto-generated labels. We show that PaLM-2 fine-tuned with those labels achieves performance that exceeds gemini-pro 1.0 and other LLMs. Furthermore, its performance is close to that of a PaLM-2 fine-tuned on labels obtained from non-expert annotators. Our results show that leveraging labels generated by powerful models like gemini-pro can serve as a viable strategy for improving LLM performance through fine-tuning on specialized tasks, particularly in domains where expert annotations are scarce, expensive, or time-consuming to obtain.
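The abstract mentions a filtering mechanism for selecting high-confidence auto-generated labels but does not specify it. One common proxy for label confidence is self-consistency: query the labeling model several times per item and keep only items whose labels agree above a threshold. The sketch below is a hypothetical illustration of that idea (the function name, threshold, and the self-consistency criterion itself are assumptions, not the paper's actual mechanism):

```python
from collections import Counter

def filter_high_confidence(samples_per_item, threshold=0.8):
    """Keep items whose repeated LLM labels agree above `threshold`.

    samples_per_item: dict mapping an item id to the list of labels
    obtained by querying the labeling LLM multiple times. Agreement
    of the majority label is used as a stand-in for confidence; the
    paper's actual filtering criterion may differ.
    Returns a dict of item id -> (majority label, agreement fraction).
    """
    kept = {}
    for item_id, labels in samples_per_item.items():
        label, count = Counter(labels).most_common(1)[0]
        agreement = count / len(labels)
        if agreement >= threshold:
            kept[item_id] = (label, agreement)
    return kept
```

Items whose sampled labels disagree too often are dropped rather than passed to fine-tuning, trading dataset size for label quality.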