Applying language models to natural language processing tasks typically relies on the representations in the final model layer, as intermediate hidden-layer representations are presumed to be less informative. In this work, we argue that because predictions improve gradually across model layers, additional information can be gleaned at inference time from the contrast between higher and lower layers. Specifically, when choosing among the probable next-token predictions of a generative model, the predictions of lower layers can be used to highlight which candidates are best avoided. We propose a novel approach that uses the contrast between layers to improve text generation outputs, and show that it mitigates degenerative behaviors of the model in open-ended generation, significantly improving the quality of generated texts. Furthermore, our results indicate that contrasting model layers at inference time can yield substantial benefits for certain aspects of general language model capabilities, extracting knowledge more effectively from a given set of model parameters.
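To make the idea of contrasting layers concrete, the following is a minimal sketch of one way a next-token choice could use a lower layer's predictions to demote candidates. It is not the method as specified by the authors: the abstract gives no scoring formula, so the log-probability difference, the plausibility cutoff `alpha`, and the names `log_softmax` and `contrast_layers` are all assumptions made for illustration. The logits are toy values standing in for the outputs of a final layer and of an earlier layer passed through the same output head.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def contrast_layers(final_logits, lower_logits, alpha=0.1):
    """Pick a next token by contrasting final-layer and lower-layer predictions.

    Assumed scoring rule (hypothetical, not taken from the abstract):
      1. keep only tokens whose final-layer probability is at least
         alpha times that of the most probable token;
      2. score each kept token by log p_final - log p_lower;
      3. return the index of the highest-scoring token and all scores.
    """
    final_lp = log_softmax(final_logits)
    lower_lp = log_softmax(lower_logits)

    # Plausibility cutoff, so that tokens the final layer considers
    # very unlikely cannot win on the contrast alone.
    cutoff = max(final_lp) + math.log(alpha)
    candidates = [i for i, lp in enumerate(final_lp) if lp >= cutoff]

    # Tokens the lower layer already favors are pushed down.
    scores = {i: final_lp[i] - lower_lp[i] for i in candidates}
    best = max(scores, key=scores.get)
    return best, scores


# Toy vocabulary and logits (illustrative values only).
vocab = ["the", "Paris", "zebra"]
final_logits = [2.0, 1.8, -3.0]   # final layer: "the" slightly ahead of "Paris"
lower_logits = [2.5, 0.0, -9.0]   # lower layer: strongly prefers "the"

greedy = max(range(len(vocab)), key=lambda i: final_logits[i])
best, scores = contrast_layers(final_logits, lower_logits)
print("greedy:", vocab[greedy], "| contrastive:", vocab[best])
```

In this toy case, greedy decoding from the final layer picks "the", while the contrast picks "Paris", the candidate whose probability grew most between the lower and the final layer. The cutoff keeps "zebra" out of the candidate set even though its raw contrast score would be the largest.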