Progressive-Hint Prompting Improves Reasoning in Large Language Models

The performance of Large Language Models (LLMs) on reasoning tasks depends heavily on prompt design, with Chain-of-Thought (CoT) and self-consistency being critical methods that enhance this ability. However, these methods do not fully exploit the answers generated by the LLM to guide subsequent responses. This paper proposes a new prompting method, named Progressive-Hint Prompting (PHP), that enables multiple automatic interactions between users and LLMs by using previously generated answers as hints that progressively guide the model toward the correct answer. PHP is orthogonal to CoT and self-consistency, making it easy to combine with state-of-the-art techniques to further improve performance. We conducted extensive experiments on seven benchmarks. The results show that PHP significantly improves accuracy while remaining highly efficient. For instance, with text-davinci-003, we observed a 4.2% improvement on GSM8K with greedy decoding compared to Complex CoT, and a 46.17% reduction in sample paths with self-consistency. With GPT-4 and PHP, we achieve state-of-the-art performance on SVAMP (89.1% -> 91.9%), GSM8K (92% -> 95.5%), AQuA (76.4% -> 79.9%) and MATH (50.3% -> 53.9%).
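The interaction loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `ask` callable, the `toy_model` stub, the exact hint wording, the `max_rounds` cap, and the rule of stopping once two consecutive answers agree are all assumptions made for the example.

```python
from typing import Callable, List


def build_prompt(question: str, hints: List[str]) -> str:
    """Append previously generated answers to the question as a hint."""
    if not hints:
        return question
    # Hint wording is an assumption chosen for illustration.
    return f"{question} (Hint: The answer is near to {', '.join(hints)}.)"


def progressive_hint(question: str,
                     ask: Callable[[str], str],
                     max_rounds: int = 5) -> str:
    """Re-ask the question with earlier answers as hints.

    Stops when two consecutive answers agree (assumed stopping rule)
    or after max_rounds hinted queries.
    """
    hints: List[str] = []
    answer = ask(build_prompt(question, hints))  # base answer, no hint yet
    for _ in range(max_rounds):
        hints.append(answer)
        new_answer = ask(build_prompt(question, hints))
        if new_answer == answer:
            return new_answer  # the answer has stabilized
        answer = new_answer
    return answer


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call: changes its answer once hinted."""
    return "8" if "Hint" in prompt else "6"


if __name__ == "__main__":
    print(progressive_hint("How many apples are left?", toy_model))
```

In practice `ask` would wrap a real model call together with a CoT or self-consistency prompt, which is where the orthogonality claimed in the abstract would come into play.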

