Recent developments in large language models (LLMs) have been impressive. However, these models sometimes exhibit inconsistencies and problematic behavior, such as hallucinating facts, generating flawed code, or producing offensive and toxic content. Humans, by contrast, typically use external tools to cross-check and refine their initial drafts, for example a search engine for fact-checking or a code interpreter for debugging. Inspired by this observation, we introduce CRITIC, a framework that allows LLMs, which are essentially "black boxes," to validate and progressively amend their own outputs in a manner similar to human interaction with tools. Specifically, starting from an initial output, CRITIC interacts with appropriate tools to evaluate certain aspects of the text and then revises the output based on the feedback obtained during validation. Comprehensive evaluations on free-form question answering, mathematical program synthesis, and toxicity reduction demonstrate that CRITIC consistently enhances the performance of LLMs. Our research also highlights the crucial importance of external feedback in promoting the ongoing self-improvement of LLMs.
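The verify-then-revise procedure described above can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names (`verify_then_correct`, `toy_generate`, `toy_verify`, `toy_revise`), the `max_rounds` stopping rule, and the arithmetic "tool" are all hypothetical stand-ins. In a real setting, the generate and revise steps would be LLM calls, and the verify step would query an external tool such as a search engine or a code interpreter.

```python
from typing import Callable, List, Tuple

# Hypothetical signatures, chosen for illustration only.
Generate = Callable[[str], str]                   # prompt -> initial output
Verify = Callable[[str, str], Tuple[bool, str]]   # (prompt, output) -> (ok, critique)
Revise = Callable[[str, str, str], str]           # (prompt, output, critique) -> new output


def verify_then_correct(
    prompt: str,
    generate: Generate,
    verify: Verify,
    revise: Revise,
    max_rounds: int = 3,  # assumed stopping rule, not taken from the source
) -> Tuple[str, List[str]]:
    """Produce an output, check it with a tool, and revise until it passes."""
    output = generate(prompt)
    history = [output]
    for _ in range(max_rounds):
        ok, critique = verify(prompt, output)
        if ok:
            break
        output = revise(prompt, output, critique)
        history.append(output)
    return output, history


# Toy stand-ins: the "model" answers an addition prompt such as "2+2",
# and the "tool" is plain integer arithmetic.
def toy_generate(prompt: str) -> str:
    return "5"  # deliberately wrong first draft


def toy_verify(prompt: str, output: str) -> Tuple[bool, str]:
    a, b = (int(term) for term in prompt.split("+"))
    expected = a + b
    if output.strip() == str(expected):
        return True, ""
    return False, f"tool computed {expected}, draft said {output}"


def toy_revise(prompt: str, output: str, critique: str) -> str:
    # A real system would prompt the LLM with the critique;
    # here the tool's value is simply read back out of the critique string.
    return critique.split()[2].rstrip(",")


if __name__ == "__main__":
    final, history = verify_then_correct("2+2", toy_generate, toy_verify, toy_revise)
    print(final, history)
```

Keeping the verifier as a separate callable mirrors the emphasis on external feedback: the check comes from outside the model rather than from the model judging its own output unaided.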