Large Language Models (LLMs) have emerged as a groundbreaking technology with their unparalleled text generation capabilities across various applications. Nevertheless, concerns persist regarding the accuracy and appropriateness of their generated content. A contemporary methodology, self-correction, has been proposed as a remedy to these issues. Building upon this premise, this paper critically examines the role and efficacy of self-correction within LLMs, shedding light on its true potential and limitations. Central to our investigation is the notion of intrinsic self-correction, whereby an LLM attempts to correct its initial responses based solely on its inherent capabilities, without the crutch of external feedback. In the context of reasoning, our research indicates that LLMs struggle to self-correct their responses without external feedback, and at times, their performance even degrades after self-correction. Drawing from these insights, we offer suggestions for future research and practical applications in this field.
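The intrinsic self-correction setup studied here can be sketched as a simple prompting loop: the model produces an initial answer, is asked to critique that answer, and then revises it — with no external feedback at any step. The sketch below is illustrative only; `generate` stands in for any hypothetical prompt-to-text LLM call, and the toy model is not a real LLM.

```python
def intrinsic_self_correct(generate, question, rounds=1):
    """Intrinsic self-correction: the model critiques and revises its own
    answer using only its own outputs -- no external feedback.
    `generate` is any callable mapping a prompt string to a text response
    (a hypothetical LLM API, not a specific library)."""
    # Step 1: initial answer.
    answer = generate(f"Q: {question}\nA:")
    for _ in range(rounds):
        # Step 2: the model reviews its own answer.
        critique = generate(
            f"Q: {question}\nYour answer: {answer}\n"
            "Review your answer and identify any problems with it."
        )
        # Step 3: the model revises based solely on its own critique.
        answer = generate(
            f"Q: {question}\nPrevious answer: {answer}\n"
            f"Critique: {critique}\n"
            "Based on the critique, give your final answer."
        )
    return answer

# Toy stand-in for an LLM, purely for illustration: it "revises" by
# returning a fixed string once a critique appears in the prompt.
def toy_model(prompt):
    return "final" if "Critique:" in prompt else "draft"

print(intrinsic_self_correct(toy_model, "2 + 2 = ?"))  # -> final
```

Note that nothing in the loop verifies the revised answer against ground truth — which is precisely why, as the abstract reports, this procedure can degrade rather than improve reasoning performance.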