Latent Skill Discovery for Chain-of-Thought Reasoning

Recent advances in Large Language Models (LLMs) have led to an emergent ability of chain-of-thought (CoT) prompting, a prompt reasoning strategy that adds intermediate rationale steps between questions and answers to construct prompts. Conditioned on these prompts, LLMs can effectively learn in context to generate rationales that lead to more accurate answers than when answering the same question directly. To design LLM prompts, one important setting, called demonstration selection, considers selecting demonstrations from an example bank. Existing methods use various heuristics for this selection, but for CoT prompting, which involves unique rationales, it is essential to base the selection upon the intrinsic skills that CoT rationales need, for instance, the skills of addition or subtraction for math word problems. To address this requirement, we introduce a novel approach named Reasoning Skill Discovery (RSD) that use unsupervised learning to create a latent space representation of rationales, called a reasoning skill. Simultaneously, RSD learns a reasoning policy to determine the required reasoning skill for a given question. This can then guide the selection of examples that demonstrate the required reasoning skills. Our approach offers several desirable properties: it is (1) theoretically grounded, (2) sample-efficient, requiring no LLM inference or manual prompt design, and (3) LLM-agnostic. Empirically, RSD outperforms existing methods by up to 6% in terms of the answer accuracy across multiple reasoning tasks.

翻译：近期大语言模型（LLMs）的进展催生了链式思维提示这一新兴能力，这是一种在问题与答案之间添加中间推理步骤以构建提示的提示推理策略。基于这些提示，LLMs能通过上下文有效学习生成推理过程，从而比直接回答问题获得更准确的答案。在设计LLM提示时，一个重要设置称为示例选择，需从示例库中选取示例。现有方法对此使用多种启发式策略，但对于包含独特推理过程的链式思维提示，必须基于链式思维推理所需的内在技能（如数学文字题中的加减法技能）进行选择。为应对这一需求，我们提出名为推理技能发现（RSD）的新方法，通过无监督学习创建推理过程的潜在空间表征——即推理技能。同时，RSD学习一种推理策略来确定给定问题所需的推理技能，进而指导选择展示所需推理技能的示例。本方法具备以下理想特性：（1）理论严谨，（2）样本高效，无需LLM推理或手动提示设计，（3）与LLM无关。实验表明，RSD在多项推理任务的答案准确率上较现有方法最高提升6%。

相关内容

大语言模型

关注 67

大语言模型是基于海量文本数据训练的深度学习模型。它不仅能够生成自然语言文本，还能够深入理解文本含义，处理各种自然语言任务，如文本摘要、问答、翻译等。2023年，大语言模型及其在人工智能领域的应用已成为全球科技研究的热点，其在规模上的增长尤为引人注目，参数量已从最初的十几亿跃升到如今的一万亿。参数量的提升使得模型能够更加精细地捕捉人类语言微妙之处，更加深入地理解人类语言的复杂性。在过去的一年里，大语言模型在吸纳新知识、分解复杂任务以及图文对齐等多方面都有显著提升。随着技术的不断成熟，它将不断拓展其应用范围，为人类提供更加智能化和个性化的服务，进一步改善人们的生活和生产方式。

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

专知会员服务

46+阅读 · 2021年11月24日

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

35+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日