Recent developments in large language models (LLMs) have been impressive. However, these models sometimes exhibit inconsistent and problematic behavior, such as hallucinating facts, generating flawed code, or producing offensive and toxic content. Humans, by contrast, typically use external tools to cross-check and refine their initial content, for example a search engine for fact-checking or a code interpreter for debugging. Inspired by this observation, we introduce CRITIC, a framework that allows LLMs, which are essentially "black boxes", to validate and progressively amend their own outputs in a manner similar to human interaction with tools. More specifically, starting with an initial output, CRITIC interacts with appropriate tools to evaluate certain aspects of the text and then revises the output based on the feedback obtained during this validation process. Comprehensive evaluations on free-form question answering, mathematical program synthesis, and toxicity reduction demonstrate that CRITIC consistently enhances the performance of LLMs. Our research also highlights the crucial importance of external feedback in promoting the ongoing self-improvement of LLMs.
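The verify-then-revise cycle described above can be sketched as a short loop. This is a minimal illustration, not the authors' implementation: the names `critic_loop`, `run_program`, `stub_generate`, and `stub_revise` are hypothetical, the two stubs stand in for calls to a black-box LLM, and the "tool" is a toy code interpreter that only reports execution errors.

```python
from typing import Callable, Optional


def critic_loop(
    prompt: str,
    generate: Callable[[str], str],
    verify: Callable[[str, str], Optional[str]],
    revise: Callable[[str, str, str], str],
    max_rounds: int = 3,
) -> str:
    """Generate an output, then alternate tool-based verification and revision.

    `verify` returns None when the tool finds no problem, otherwise a
    textual critique that is passed to `revise`.
    """
    output = generate(prompt)
    for _ in range(max_rounds):
        feedback = verify(prompt, output)
        if feedback is None:  # the tool found nothing to fix
            break
        output = revise(prompt, output, feedback)
    return output


def run_program(prompt: str, program: str) -> Optional[str]:
    """Toy external tool: execute the program and report any failure."""
    scope: dict = {}
    try:
        exec(program, scope)  # assumption: trusted toy input only
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    if "answer" not in scope:
        return "program did not define `answer`"
    return None


# Hypothetical stand-ins for LLM calls (assumptions for illustration).
def stub_generate(prompt: str) -> str:
    return "answer = total / count"  # flawed: variables are undefined


def stub_revise(prompt: str, output: str, feedback: str) -> str:
    return "answer = (3 + 5) / 2"  # a revision informed by the feedback


result = critic_loop(
    "What is the mean of 3 and 5?", stub_generate, run_program, stub_revise
)
```

In this sketch the first program fails inside the interpreter, the error message is returned as the critique, and the loop stops once the revised program runs cleanly. A real system would replace the stubs with model calls and could swap `run_program` for a search engine or a toxicity scorer, depending on the task.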