The emergence of generative Large Language Models (LLMs) emphasizes the need for accurate and efficient prompting approaches. LLMs are often applied in Few-Shot Learning (FSL) contexts, where tasks are executed with minimal training data. FSL has become popular in many Artificial Intelligence (AI) subdomains, including AI for health. Rare diseases affect a small fraction of the population, so data on them are inherently limited, and manual data collection and annotation are costly and time-consuming; rare disease tasks therefore call for FSL techniques. In this paper, we propose Models-Vote Prompting (MVP), a flexible prompting approach for improving the performance of LLM queries in FSL settings. MVP works by prompting multiple LLMs to perform the same task and then taking a majority vote over the resulting outputs. This method outperforms every individual model in the ensemble on one-shot rare disease identification and classification tasks. We also release a novel rare disease dataset for FSL, available to those who have agreed to the MIMIC-IV Data Use Agreement (DUA). Furthermore, because MVP prompts each model multiple times, it substantially increases the time needed for manual annotation. To address this, we assess the feasibility of using JSON to automate generative LLM evaluation.
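As a rough illustration of the voting and JSON-based evaluation ideas described in the abstract, the following minimal sketch parses each model's reply as JSON and takes a vote over the extracted labels. It is not the implementation used in the paper. The function names `parse_label` and `models_vote`, the `"label"` key, the example replies, and the tie-handling rule (the most frequent label wins, and a tie returns `None`) are all assumptions made for this example.

```python
import json
from collections import Counter
from typing import List, Optional


def parse_label(raw: str, key: str = "label") -> Optional[str]:
    """Extract a label from one model reply that is expected to be a JSON object.

    Returns None when the reply is not valid JSON, is not an object,
    or does not carry a string under `key` (a hypothetical key name).
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    value = data.get(key)
    if not isinstance(value, str):
        return None
    # Normalize so that "Yes" and " yes " count as the same vote.
    return value.strip().lower()


def models_vote(raw_outputs: List[str], key: str = "label") -> Optional[str]:
    """Return the most frequent label among the parseable replies.

    Assumption: unparseable replies are ignored, and a tie for first
    place yields None instead of an arbitrary winner.
    """
    labels = [label for label in (parse_label(r, key) for r in raw_outputs)
              if label is not None]
    if not labels:
        return None
    ranked = Counter(labels).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Hypothetical replies from four models to the same one-shot prompt.
replies = ['{"label": "Yes"}', '{"label": "yes"}', '{"label": "no"}', "not json"]
print(models_vote(replies))
```

Requesting a fixed JSON schema from every model is what makes the vote automatable in this sketch: a reply that fails to parse is simply dropped from the tally, so it never needs manual inspection.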