The recent success of Large Language Models (LLMs) has catalyzed growing interest in their self-correction capabilities. This paper presents a comprehensive investigation into the intrinsic self-correction of LLMs, aiming to resolve the ongoing debate about its feasibility. Our research identifies an important latent factor in the self-correction process: the "confidence" of LLMs. Overlooking this factor may cause models to over-criticize themselves, leading to unreliable conclusions about the efficacy of self-correction. We experimentally observe that LLMs are capable of assessing the "confidence" of their own responses. This motivates us to develop an "If-or-Else" (IoE) prompting framework that guides LLMs in assessing their own "confidence", thereby facilitating intrinsic self-correction. Extensive experiments demonstrate that our IoE-based prompt achieves consistent accuracy improvements of self-corrected responses over the initial answers. Our study not only sheds light on the underlying factors affecting self-correction in LLMs, but also introduces a practical framework that applies the IoE prompting principle to efficiently improve self-correction capabilities with "confidence". The code is available at https://github.com/MBZUAI-CLeaR/IoE-Prompting.git.
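As a rough illustration of the IoE principle described above, self-correction can be gated by a single "If-or-Else" follow-up prompt: if the model is confident in its initial answer, it keeps it; else, it revises. This is a minimal sketch only; the prompt wording, the `ask` callable, and the single-round design are assumptions for illustration, not the paper's exact implementation.

```python
# Hypothetical sketch of one IoE self-correction round.
# `ask` stands in for any LLM call mapping a prompt string to a response string.

# Illustrative IoE instruction (not the paper's verbatim prompt):
IOE_PROMPT = (
    "If you are confident about your answer above, keep it unchanged. "
    "Else, update your answer and briefly explain the correction."
)


def ioe_self_correct(question: str, ask) -> str:
    """Run a single IoE self-correction round over an initial answer."""
    initial = ask(question)  # first-pass answer from the model
    followup = (
        f"{question}\n"
        f"Your previous answer: {initial}\n"
        f"{IOE_PROMPT}"
    )
    return ask(followup)  # confidence-gated revision (or unchanged answer)
```

In practice `ask` would wrap a chat-completion API call; the key design point is that the model is asked to condition its revision on its own confidence, rather than being told unconditionally to find mistakes.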