Modern large language models (LLMs) have demonstrated impressive capabilities on sophisticated tasks, often through step-by-step reasoning similar to that of humans. This is made possible by their strong few-shot and zero-shot abilities: they can learn effectively from a handful of handcrafted, completed responses ("in-context examples"), or be prompted to reason spontaneously through specially designed triggers. Nonetheless, some limitations have been observed. First, performance in the few-shot setting is sensitive to the choice of examples, whose design requires significant human effort. Moreover, given the diversity of downstream tasks for LLMs, it may be difficult or laborious to handcraft per-task labels. Second, while the zero-shot setting does not require handcrafting, its performance is limited by the lack of guidance given to the LLM. To address these limitations, we propose Consistency-based Self-adaptive Prompting (COSP), a novel prompt design method for LLMs. Requiring neither handcrafted responses nor ground-truth labels, COSP selects and builds the set of examples from the LLM's own zero-shot outputs via carefully designed criteria that combine consistency, diversity and repetition. In the zero-shot setting for three different LLMs, we show that, using only LLM predictions, COSP improves performance by up to 15% compared to zero-shot baselines and matches or exceeds few-shot baselines on a range of reasoning tasks.
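As a rough illustration of the consistency criterion only, the sketch below scores each question by the normalized entropy of the answers sampled from the LLM in the zero-shot setting and keeps the most self-consistent ones as in-context examples. The abstract does not specify the scoring function, so the entropy measure, the function names `normalized_entropy` and `select_demonstrations`, and the input layout are all assumptions made for illustration; the diversity and repetition criteria are not modeled here.

```python
import math
from collections import Counter


def normalized_entropy(answers):
    """Entropy of the sampled answers, scaled to [0, 1] by log(number of samples).

    0 means every sample agrees; 1 means every sample gives a different answer.
    (Assumed scoring rule for illustration, not taken from the abstract.)
    """
    n = len(answers)
    if n <= 1:
        return 0.0
    counts = Counter(answers)
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    return h / math.log(n)


def select_demonstrations(samples, k):
    """Pick the k questions whose zero-shot answers are most self-consistent.

    samples: dict mapping a question to a list of (rationale, answer) pairs
             obtained by sampling the LLM several times (hypothetical layout).
    Returns a list of (question, rationale, answer) triples, where the
    rationale is the first sampled one that reaches the majority answer.
    """
    scored = []
    for question, outputs in samples.items():
        answers = [answer for _, answer in outputs]
        majority, _ = Counter(answers).most_common(1)[0]
        rationale = next(r for r, a in outputs if a == majority)
        scored.append((normalized_entropy(answers), question, rationale, majority))
    # Lower entropy means more consistent; the sort is stable, so ties keep input order.
    scored.sort(key=lambda item: item[0])
    return [(q, r, a) for _, q, r, a in scored[:k]]
```

The selected triples would then be prepended to the prompt as pseudo-demonstrations, so no handcrafted responses or ground-truth labels are needed; only the model's own predictions are used.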