Generative language models (LMs) such as ChatGPT have exhibited remarkable performance across various downstream tasks. Nevertheless, one of their most prominent drawbacks is generating inaccurate or false information with a confident tone. Previous studies have devised sophisticated pipelines and prompts to induce large LMs to exhibit the capability for self-correction. However, these large LMs are explicitly prompted to verify and revise their answers in separate steps rather than completing the whole process spontaneously, as humans do. Moreover, such complex prompts are extremely challenging for small LMs to follow. In this paper, we introduce \underline{I}ntrinsic \underline{S}elf-\underline{C}orrection (ISC) in generative language models, aiming to correct the initial output of LMs in a self-triggered manner, even for small LMs with as few as 6 billion parameters. Specifically, we devise a pipeline for constructing self-correction data and propose Partial Answer Masking (PAM), which aims to endow the model with the capability for intrinsic self-correction through fine-tuning. We conduct experiments on two tasks, commonsense reasoning and factual knowledge reasoning, using LMs with parameter sizes ranging from 6 billion to 13 billion. Our experiments demonstrate that outputs generated with ISC outperform those generated without self-correction. We believe that the output quality of even small LMs can be further improved by empowering them with the ability to self-correct intrinsically.
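The core of Partial Answer Masking can be illustrated with a minimal sketch. The assumption (not spelled out in the abstract) is that PAM excludes the tokens of the initial, possibly incorrect answer from the fine-tuning loss, so the model is trained only on the self-verification and corrected-answer portion of each example. The function name and the span-based interface below are hypothetical; `-100` is the ignore-index convention used by common cross-entropy implementations.

```python
# Hypothetical sketch of Partial Answer Masking (PAM): mask the initial-answer
# tokens out of the loss so fine-tuning focuses on the self-correction part.
IGNORE_INDEX = -100  # label id skipped by common cross-entropy losses

def pam_labels(input_ids, answer_start, answer_end):
    """Return labels for causal-LM fine-tuning where the initial-answer span
    [answer_start, answer_end) contributes no loss; every other position is
    trained normally (labels equal to the input token ids)."""
    labels = list(input_ids)
    for i in range(answer_start, answer_end):
        labels[i] = IGNORE_INDEX  # this token is ignored by the loss
    return labels

# Toy example: a 10-token sequence whose initial answer spans positions 3-5.
ids = list(range(10))
labels = pam_labels(ids, 3, 6)
# → [0, 1, 2, -100, -100, -100, 6, 7, 8, 9]
```

In a real training loop these labels would be passed alongside the input ids to the model, and the masked positions would simply not back-propagate gradient.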