Progressive-Hint Prompting Improves Reasoning in Large Language Models

The performance of Large Language Models (LLMs) on reasoning tasks depends heavily on prompt design, and Chain-of-Thought (CoT) and self-consistency are critical methods for enhancing it. However, these methods do not fully exploit the answers the LLM has already generated to guide its subsequent responses. This paper proposes a new prompting method, Progressive-Hint Prompting (PHP), which automates multiple rounds of interaction between the user and the LLM by feeding previously generated answers back as hints that progressively guide the model toward the correct answer. PHP is orthogonal to CoT and self-consistency, so it is easy to combine with state-of-the-art techniques to further improve performance. We conducted extensive experiments on seven benchmarks. The results show that PHP significantly improves accuracy while remaining highly efficient. For instance, with text-davinci-003, we observed a 4.2% improvement on GSM8K with greedy decoding compared to Complex CoT, and a 46.17% reduction in sample paths with self-consistency. With GPT-4 and PHP, we achieve state-of-the-art performance on SVAMP (89.1% -> 91.9%), GSM8K (92% -> 95.5%), AQuA (76.4% -> 79.9%) and MATH (50.3% -> 53.9%).
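The interaction loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `ask_model` callable, the hint wording, and the stopping rule (stop once two consecutive answers agree, or after `max_rounds`) are all assumptions made for the example, since the abstract does not specify them. A scripted stand-in replaces the real LLM so the block runs on its own.

```python
from typing import Callable, List, Tuple


def progressive_hint_prompting(
    question: str,
    ask_model: Callable[[str], str],
    max_rounds: int = 5,
) -> Tuple[str, List[str]]:
    """Re-ask the question, appending earlier answers as hints.

    Assumed stopping rule: return as soon as the model repeats its
    previous answer, or when max_rounds is exhausted.
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")

    answers: List[str] = []  # every answer produced so far, in order
    prompt = question        # the first round uses the plain question
    for _ in range(max_rounds):
        answer = ask_model(prompt)
        if answers and answer == answers[-1]:
            # Two consecutive answers agree: treat the answer as settled.
            return answer, answers + [answer]
        answers.append(answer)
        # Hypothetical hint wording; all earlier answers are fed back.
        hint = ", ".join(answers)
        prompt = f"{question} (Hint: The answer is near to {hint})."
    return answers[-1], answers


def make_scripted_model(replies: List[str], seen_prompts: List[str]) -> Callable[[str], str]:
    """Stand-in for an LLM call: returns canned replies and records prompts."""
    remaining = iter(replies)

    def ask(prompt: str) -> str:
        seen_prompts.append(prompt)
        return next(remaining)

    return ask


if __name__ == "__main__":
    prompts: List[str] = []
    model = make_scripted_model(["6", "8", "8"], prompts)
    final, history = progressive_hint_prompting("How many apples are left?", model)
    print(final, history)
    for p in prompts:
        print(p)
```

In practice `ask_model` would wrap a call to an LLM with a CoT-style prompt and parse the final answer out of the response. Because the loop only changes what is appended to the question, it can sit on top of CoT or self-consistency without altering them, which is the sense in which the abstract calls PHP orthogonal to those methods.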

