The emergence of generative Large Language Models (LLMs) emphasizes the need for accurate and efficient prompting approaches. LLMs are often applied in Few-Shot Learning (FSL) contexts, where tasks are executed with minimal training data. FSL has become popular in many Artificial Intelligence (AI) subdomains, including AI for health. Because rare diseases affect only a small fraction of the population, identifying them from clinical notes inherently requires FSL techniques: data are scarce, and manual data collection and annotation are both expensive and time-consuming. In this paper, we propose Models-Vote Prompting (MVP), a flexible prompting approach for improving the performance of LLM queries in FSL settings. MVP works by prompting multiple LLMs to perform the same task and then conducting a majority vote on the resulting outputs. This method outperforms any single model in the ensemble on one-shot rare disease identification and classification tasks. We also release a novel rare disease dataset for FSL, available to those who have signed the MIMIC-IV Data Use Agreement (DUA). Furthermore, because MVP prompts each model multiple times, it substantially increases the time needed for manual annotation; to address this, we assess the feasibility of using JSON to automate generative LLM evaluation.
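As a rough illustration of the voting and JSON-based evaluation ideas described above, the following minimal Python sketch parses each model's reply as JSON and takes the most frequent label. It is not the authors' implementation. The `"label"` key, the function names `parse_label` and `models_vote`, and the choice to return `None` on a tie or when no reply parses are all assumptions made for this example, since the abstract does not specify them.

```python
import json
from collections import Counter


def parse_label(raw_output, key="label"):
    """Return the string stored under `key` in a JSON reply, or None if unusable.

    The key name "label" is a hypothetical choice for this sketch.
    """
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError:
        return None  # the model did not return valid JSON
    if not isinstance(parsed, dict):
        return None  # valid JSON, but not an object with named fields
    value = parsed.get(key)
    return value if isinstance(value, str) else None


def models_vote(raw_outputs, key="label"):
    """Return the most frequent parsable label across model replies.

    Assumption: a tie for first place, or no parsable reply, yields None.
    """
    labels = [parse_label(raw, key) for raw in raw_outputs]
    labels = [label for label in labels if label is not None]
    if not labels:
        return None
    ranked = Counter(labels).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie between the top two labels
    return ranked[0][0]


# Hypothetical replies from four different models for the same prompt.
replies = [
    '{"label": "yes"}',
    '{"label": "yes"}',
    '{"label": "no"}',
    "Sorry, I cannot answer in JSON.",
]
print(models_vote(replies))
```

Requiring a JSON reply lets unparsable outputs be detected and discarded automatically instead of being read by a human annotator, which is the motivation the abstract gives for assessing JSON-based evaluation.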