Large language models (LMs) have exhibited a strong in-context learning (ICL) ability: prompted with a few input-output demonstrations, they adapt to target tasks. To improve ICL, various methods have been proposed to select representative demonstrations from existing training corpora. However, this setting is not aligned with real-world practice, as end-users usually query LMs without access to demonstration pools. Inspired by evidence suggesting that LMs' zero-shot capabilities are underrated, and that the role of demonstrations is primarily to expose models' intrinsic functionalities, we introduce Self-ICL, a simple framework for zero-shot ICL. Given a test input, Self-ICL first prompts the model to generate pseudo-inputs. Next, the model predicts pseudo-labels for the pseudo-inputs via zero-shot prompting. Finally, we construct pseudo-demonstrations from the pseudo-input-label pairs and perform ICL for the test input. Evaluation on BIG-Bench Hard shows that Self-ICL consistently surpasses zero-shot and zero-shot chain-of-thought baselines in both head-to-head comparison and all-task average performance. Our findings suggest that LMs' intrinsic capabilities can be bootstrapped towards better zero-shot performance.
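The three steps described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `self_icl`, the `lm` callable (any function mapping a prompt string to a completion string), the prompt wording, and the default of three pseudo-demonstrations are all assumptions made for the example.

```python
from typing import Callable, List


def self_icl(task: str, test_input: str, lm: Callable[[str], str], k: int = 3) -> str:
    """Sketch of zero-shot ICL with self-generated demonstrations.

    `lm` is a hypothetical text-completion callable; the prompt templates
    below are illustrative assumptions, not the paper's exact prompts.
    """
    # Step 1: prompt the model to generate pseudo-inputs, one per line.
    gen_prompt = (
        f"Task: {task}\n"
        f"Example input: {test_input}\n"
        f"Write {k} new, diverse inputs for this task, one per line."
    )
    lines = [ln.strip() for ln in lm(gen_prompt).splitlines()]
    pseudo_inputs: List[str] = [ln for ln in lines if ln][:k]

    # Step 2: predict a pseudo-label for each pseudo-input via zero-shot prompting.
    pseudo_labels = [lm(f"Task: {task}\nQ: {x}\nA:").strip() for x in pseudo_inputs]

    # Step 3: build pseudo-demonstrations and perform ICL on the test input.
    demos = "".join(f"Q: {x}\nA: {y}\n\n" for x, y in zip(pseudo_inputs, pseudo_labels))
    return lm(f"Task: {task}\n\n{demos}Q: {test_input}\nA:").strip()
```

With k pseudo-demonstrations, this sketch makes k + 2 model calls per test input: one to generate pseudo-inputs, k to label them, and one for the final prediction.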