While large language models (LLMs) such as ChatGPT and PaLM have demonstrated remarkable performance on a variety of language understanding and generation tasks, their capabilities in complex reasoning and intricate knowledge utilization still fall short of human-level proficiency. Recent studies have established that prompts are effective in steering LLMs towards desired outputs. Building on these insights, we introduce Self-Convince, a novel framework that uses large-scale pre-trained language models to iteratively improve the performance of LLMs. The framework comprises three components: \textit{Normal CoT}, a \textit{Convincer}, and an \textit{Answerer}. It takes the output of a typical few-shot chain-of-thought prompt, assesses the correctness of the response, scrutinizes the answer, refines the reasoning, and ultimately produces a new solution. Experimental results on 7 datasets of miscellaneous problems validate the efficacy of Self-Convince, which achieves substantial improvements over the baselines. This study contributes to the growing body of research on combining pre-trained language models with tailored prompts and iterative refinement to improve their performance on complex tasks.
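The three-stage loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `llm` callable, the prompt wording, the `CORRECT`/`INCORRECT` verdict convention, and the `max_rounds` parameter are all hypothetical, and the split of duties (the Convincer judges and critiques, the Answerer rewrites) is an assumption read off the component names.

```python
from typing import Callable, Tuple

# Hypothetical interface: any function mapping a prompt string to a completion.
LLM = Callable[[str], str]


def normal_cot(question: str, few_shot: str, llm: LLM) -> str:
    """Stage 1 (Normal CoT): ordinary few-shot chain-of-thought draft."""
    return llm(f"{few_shot}\nQ: {question}\nA: Let's think step by step.")


def convincer(question: str, draft: str, llm: LLM) -> Tuple[bool, str]:
    """Stage 2 (Convincer): judge the draft; return (accepted, critique).

    The verdict format is an assumption made for this sketch.
    """
    verdict = llm(
        f"Question: {question}\nProposed solution: {draft}\n"
        "Start your reply with CORRECT or INCORRECT, then explain."
    )
    accepted = verdict.strip().upper().startswith("CORRECT")
    return accepted, verdict


def answerer(question: str, draft: str, critique: str, llm: LLM) -> str:
    """Stage 3 (Answerer): produce a new solution using the critique."""
    return llm(
        f"Question: {question}\nPrevious solution: {draft}\n"
        f"Critique: {critique}\nWrite a corrected step-by-step solution."
    )


def self_convince(question: str, few_shot: str, llm: LLM, max_rounds: int = 2) -> str:
    """Draft once, then alternate judging and rewriting until accepted."""
    draft = normal_cot(question, few_shot, llm)
    for _ in range(max_rounds):
        accepted, critique = convincer(question, draft, llm)
        if accepted:
            break
        draft = answerer(question, draft, critique, llm)
    return draft
```

Passing the model as a plain callable keeps the sketch independent of any particular API client; a real setup would wrap whichever model endpoint is in use and would likely parse the verdict more robustly than a prefix check.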