Recent developments in large language models (LLMs) have been impressive. However, these models sometimes exhibit inconsistencies and problematic behavior, such as hallucinating facts, generating flawed code, or producing offensive and toxic content. Unlike these models, humans typically use external tools to cross-check and refine their initial content, such as a search engine for fact-checking or a code interpreter for debugging. Inspired by this observation, we introduce CRITIC, a framework that allows LLMs, which are essentially "black boxes," to validate and progressively amend their own outputs in a manner similar to human interaction with tools. More specifically, starting with an initial output, CRITIC interacts with appropriate tools to evaluate certain aspects of the text and then revises the output based on the feedback obtained during this validation process. Comprehensive evaluations involving free-form question answering, mathematical program synthesis, and toxicity reduction demonstrate that CRITIC consistently enhances the performance of LLMs. Our research also highlights the crucial importance of external feedback in promoting the ongoing self-improvement of LLMs.
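The verify-then-revise loop described above can be illustrated with a minimal sketch. This is not the CRITIC implementation: `critic_loop`, `arithmetic_checker`, and `toy_reviser` are hypothetical names introduced here, the "tool" is a toy arithmetic checker standing in for a search engine or code interpreter, and the reviser is a plain function standing in for an LLM call.

```python
from typing import Callable, List, Optional, Tuple


def critic_loop(
    initial_output: str,
    verify: Callable[[str], Optional[str]],
    revise: Callable[[str, str], str],
    max_rounds: int = 3,
) -> Tuple[str, List[str]]:
    """Alternate tool-based verification and revision (illustrative only).

    `verify` returns a critique string, or None when the tool finds no problem.
    `revise` produces a new output from the old output and the critique.
    """
    output = initial_output
    history = [output]
    for _ in range(max_rounds):
        critique = verify(output)
        if critique is None:  # the tool raised no objection, so stop early
            break
        output = revise(output, critique)
        history.append(output)
    return output, history


def arithmetic_checker(claim: str) -> Optional[str]:
    """Toy external tool (assumption): recompute a claim of the form 'a * b = c'."""
    lhs, rhs = claim.split("=")
    a, b = (int(x) for x in lhs.split("*"))
    if a * b == int(rhs):
        return None
    return f"{a} * {b} is {a * b}, not {int(rhs)}"


def toy_reviser(claim: str, critique: str) -> str:
    """Stand-in for the LLM revision step: adopt the value the tool reported."""
    correct = critique.split(" is ")[1].split(",")[0]
    lhs = claim.split("=")[0].strip()
    return f"{lhs} = {correct}"


if __name__ == "__main__":
    final, history = critic_loop("17 * 24 = 398", arithmetic_checker, toy_reviser)
    print(final)    # 17 * 24 = 408
    print(history)  # ['17 * 24 = 398', '17 * 24 = 408']
```

In this sketch the model is treated as a black box: the loop sees only its text outputs and the tool's feedback, and stops as soon as the tool reports no remaining issue or the round limit is reached.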