Large Language Models (LLMs) have emerged as a groundbreaking technology, with strong text generation capabilities across a wide range of applications. Nevertheless, concerns persist about the accuracy and appropriateness of the content they generate. Self-correction has recently been proposed as a remedy for these issues. Building on this premise, this paper critically examines the role and efficacy of self-correction in LLMs, shedding light on its true potential and limitations. Central to our investigation is the notion of intrinsic self-correction, in which an LLM attempts to correct its initial responses based solely on its inherent capabilities, without external feedback. In the context of reasoning, our research indicates that LLMs struggle to self-correct their responses without external feedback, and that their performance sometimes even degrades after self-correction. Drawing on these insights, we offer suggestions for future research and practical applications in this field.
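The intrinsic self-correction setting described above can be sketched as a loop in which every prompt is built only from the question and the model's own earlier output. This is a minimal illustration, not the prompts or procedure used in the paper: the function name `intrinsic_self_correct`, the `Model` type, the prompt wording, and the `toy_model` stand-in are all assumptions introduced here.

```python
from typing import Callable, List

# Hypothetical type: any function that maps a prompt string to a completion.
Model = Callable[[str], str]


def intrinsic_self_correct(model: Model, question: str, rounds: int = 1) -> List[str]:
    """Return the initial answer followed by one revised answer per round.

    No oracle label, tool output, or human feedback enters the loop: each
    prompt contains only the question and text the model itself produced.
    """
    answers = [model(f"Question: {question}\nAnswer:")]
    for _ in range(rounds):
        # Step 1: the model reviews its own latest answer.
        critique = model(
            f"Question: {question}\nAnswer: {answers[-1]}\n"
            "Review the answer above and point out any problems."
        )
        # Step 2: the model revises the answer using only its own review.
        revised = model(
            f"Question: {question}\nAnswer: {answers[-1]}\n"
            f"Review: {critique}\n"
            "Based on the review, give an improved answer."
        )
        answers.append(revised)
    return answers


def toy_model(prompt: str) -> str:
    """Deterministic stand-in for an LLM, contrived purely for illustration."""
    if prompt.endswith("give an improved answer."):
        return "5"  # the toy model talks itself out of a correct answer
    if prompt.endswith("point out any problems."):
        return "The arithmetic may be wrong."
    return "4"


history = intrinsic_self_correct(toy_model, "What is 2 + 2?", rounds=1)
print(history)
```

The toy model is scripted to change a correct answer into an incorrect one, so the example only shows where degradation after self-correction could enter such a loop; it is not evidence of how any real model behaves. Replacing `toy_model` with a call to an actual LLM, and comparing `history[0]` with `history[-1]` against known labels after the fact, would be one way to measure the effect.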