We introduce a frustratingly simple, super efficient and surprisingly effective decoding method for neural text generation, which we call Frustratingly Simple Decoding (FSD). The idea behind FSD is straightforward: we build an anti-LM from the previously generated text and use this anti-LM to penalize the regeneration of what has already been generated. The anti-LM can be implemented as simply as an n-gram language model or a vectorized variant. As a result, FSD introduces no extra model parameters and negligible computational overhead (FSD can be as fast as greedy search). Despite its simplicity, FSD is surprisingly effective: experiments show that FSD can outperform the canonical methods to date (i.e., nucleus sampling) as well as several strong baselines that were proposed recently.
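To make the idea concrete, the following is a minimal sketch of penalizing a greedy decoder with an n-gram anti-LM built from the text generated so far. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the scoring rule, so the linear combination `(1 - alpha) * p_lm - alpha * p_anti`, the weight `alpha`, the unsmoothed n-gram counts, the decision to count the prompt as well, and the names `NGramAntiLM`, `penalized_greedy_decode` and `toy_next_probs` are all hypothetical.

```python
from collections import defaultdict


class NGramAntiLM:
    """Hypothetical helper: relative n-gram frequencies over the text so far."""

    def __init__(self, n=2):
        self.n = n
        # counts[prefix][token] = how often `token` followed `prefix`
        self.counts = defaultdict(lambda: defaultdict(int))

    def update(self, tokens):
        # Record the n-gram that ends at the newest token.
        if len(tokens) >= self.n:
            prefix = tuple(tokens[-self.n:-1])
            self.counts[prefix][tokens[-1]] += 1

    def prob(self, tokens, candidate):
        # Relative frequency of `candidate` after the current (n-1)-token prefix.
        if len(tokens) < self.n - 1:
            return 0.0
        prefix = tuple(tokens[len(tokens) - (self.n - 1):])
        seen = self.counts.get(prefix)
        if not seen:
            return 0.0
        return seen.get(candidate, 0) / sum(seen.values())


def penalized_greedy_decode(next_probs, prompt, steps, alpha=0.5, n=2):
    """Greedy decoding with an anti-LM penalty (assumed linear scoring rule)."""
    tokens = list(prompt)
    anti = NGramAntiLM(n)
    # Assumption: the prompt also counts as "previously generated text".
    for i in range(1, len(tokens) + 1):
        anti.update(tokens[:i])
    for _ in range(steps):
        probs = next_probs(tokens)
        # Score each candidate: reward the LM, penalize the anti-LM.
        best = max(
            probs,
            key=lambda tok: (1 - alpha) * probs[tok] - alpha * anti.prob(tokens, tok),
        )
        tokens.append(best)
        anti.update(tokens)
    return tokens


def toy_next_probs(tokens):
    # Toy stand-in for a neural LM: the distribution depends on the last token only.
    table = {
        "a": {"b": 0.6, "c": 0.4},
        "b": {"a": 0.6, "c": 0.4},
        "c": {"a": 0.7, "b": 0.3},
    }
    return table[tokens[-1]]


if __name__ == "__main__":
    print(penalized_greedy_decode(toy_next_probs, ["a"], steps=6, alpha=0.0))
    print(penalized_greedy_decode(toy_next_probs, ["a"], steps=6, alpha=0.5))
```

With `alpha=0.0` the sketch reduces to plain greedy search, which on this toy model alternates between `a` and `b` indefinitely. With `alpha=0.5` the bigram already seen is penalized and the decoder leaves the loop. The only extra work per step is a dictionary lookup and a count update, which fits the abstract's claim of negligible overhead.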