The decoding algorithm is critical for open-ended text generation: it transforms latent representations into coherent and meaningful outputs. This paper investigates the self-reinforcement effect in text generation and the effectiveness of a repetition penalty in mitigating it. Determining the optimal repetition penalty value is challenging, however. To address this, we propose a forgetting mechanism that disregards distant tokens, easing the burden of selecting a penalty value. In addition, we introduce a length penalty to counteract the overly short sentences that excessive penalties can produce. Our penalty decoding approach, which combines these three strategies, helps address the tendency of sampling methods to deviate from factual information. Experimental results demonstrate that our approach generates high-quality sentences that resemble human output.
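The three strategies can be illustrated with a minimal sketch of a single decoding step. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the repetition penalty uses the common rule of dividing positive logits and multiplying negative ones, the forgetting mechanism is modeled as a fixed window over the most recent tokens, and the length penalty is modeled as a reduction of the end-of-sequence logit while the output is shorter than a target length. The names `penalize_logits`, `window`, `target_len`, and `length_penalty` are hypothetical.

```python
from typing import List, Sequence


def penalize_logits(
    logits: Sequence[float],
    generated: Sequence[int],
    eos_id: int,
    repetition_penalty: float = 1.5,
    window: int = 4,
    length_penalty: float = 0.1,
    target_len: int = 10,
) -> List[float]:
    """Return adjusted next-token logits (illustrative sketch, not the paper's exact method)."""
    out = list(logits)

    # Forgetting mechanism (assumed form): only the last `window` tokens are
    # remembered, so tokens further back are no longer penalized.
    recent = set(generated[-window:]) if window > 0 else set()

    # Repetition penalty (common formulation): make every remembered token
    # less likely, whatever the sign of its logit.
    for tok in recent:
        if out[tok] > 0:
            out[tok] /= repetition_penalty
        else:
            out[tok] *= repetition_penalty

    # Length penalty (assumed form): discourage ending the sequence while the
    # output is still shorter than `target_len`.
    shortfall = max(0, target_len - len(generated))
    out[eos_id] -= length_penalty * shortfall
    return out


def greedy_pick(logits: Sequence[float]) -> int:
    """Pick the index of the highest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])


if __name__ == "__main__":
    vocab_logits = [2.0, 1.0, -1.0, 0.5]   # toy 4-token vocabulary, id 3 = EOS
    history = [0, 2, 1, 1]                 # token ids generated so far
    adjusted = penalize_logits(vocab_logits, history, eos_id=3, window=2)
    print(adjusted, greedy_pick(adjusted))
```

With a small `window`, token 0 in the toy history falls outside the remembered span and keeps its original logit, which is the sense in which the forgetting mechanism limits the reach of the penalty.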