Modern language models can generate high-quality short texts. However, they often meander or become incoherent when generating longer texts. These issues arise from the next-token-only language modeling objective. Recent work in self-supervised learning suggests that models can learn good latent representations via contrastive learning, which can be effective for discriminative tasks. Our work analyzes the application of contrastive representations to generative tasks, such as long text generation. We propose one approach for leveraging contrastive representations, which we call Time Control (TC). TC first learns a contrastive representation of the target text domain, then generates text by decoding from these representations. Compared with domain-specific methods and fine-tuned GPT2 across a variety of text domains, TC performs competitively with methods designed specifically for learning sentence representations on discourse coherence. In long text generation settings, TC preserves the text structure both in terms of ordering (up to $+15\%$ better) and text length consistency (up to $+90\%$ better).