Although recently emerged large language models (LLMs) such as ChatGPT exhibit impressive general performance, they still lag far behind fully supervised models on specific tasks such as multi-span question answering. Previous research has found in-context learning to be an effective way of exploiting LLMs: a few task-related labeled examples are used as demonstrations to construct a few-shot prompt for answering new questions. A popular implementation concatenates a few questions and their correct answers through simple templates, informing the LLM of the desired output. In this paper, we propose a novel way of employing labeled data such that it also informs the LLM of some undesired output, by extending demonstration examples with feedback about answers predicted by an off-the-shelf model, e.g., correct, incorrect, or incomplete. Experiments on three multi-span question answering datasets as well as a keyphrase extraction dataset show that our new prompting strategy consistently improves the LLM's in-context learning performance.
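To make the idea of feedback-extended demonstrations concrete, the following is a minimal sketch of how such a prompt could be assembled. It is an illustration under stated assumptions, not the templates used in the paper: the `Demo` class, the `feedback` and `build_prompt` functions, the template wording ("Predicted answers", "Correct", "Incorrect", "Missing"), and the use of exact string matching to compare predicted and gold spans are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Demo:
    question: str
    gold: List[str]       # correct answer spans from the labeled data
    predicted: List[str]  # spans predicted by an off-the-shelf model


def feedback(demo: Demo) -> str:
    """Mark predicted spans as correct or incorrect and list missing gold spans.

    Exact string matching is a simplifying assumption of this sketch.
    """
    correct = [p for p in demo.predicted if p in demo.gold]
    incorrect = [p for p in demo.predicted if p not in demo.gold]
    missing = [g for g in demo.gold if g not in demo.predicted]
    lines = []
    if correct:
        lines.append("Correct: " + "; ".join(correct))
    if incorrect:
        lines.append("Incorrect: " + "; ".join(incorrect))
    if missing:
        # Gold spans the model did not predict, i.e. the prediction is incomplete.
        lines.append("Missing: " + "; ".join(missing))
    return "\n".join(lines)


def build_prompt(demos: List[Demo], new_question: str, with_feedback: bool = True) -> str:
    """Concatenate demonstrations through a simple template (hypothetical wording)."""
    blocks = []
    for demo in demos:
        block = [f"Question: {demo.question}"]
        if with_feedback:
            # Show the model's attempt and the feedback before the gold answers.
            block.append("Predicted answers: " + "; ".join(demo.predicted))
            fb = feedback(demo)
            if fb:
                block.append(fb)
        block.append("Answers: " + "; ".join(demo.gold))
        blocks.append("\n".join(block))
    # The new question is left open for the LLM to complete.
    blocks.append(f"Question: {new_question}\nAnswers:")
    return "\n\n".join(blocks)


example = [Demo("Which colors are on the French flag?",
                ["blue", "white", "red"],
                ["blue", "green"])]
print(build_prompt(example, "Which colors are on the Italian flag?"))
```

Calling `build_prompt` with `with_feedback=False` yields the conventional question-and-answer prompt, so the two prompting styles can be compared on the same demonstrations.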