Interest in gradient-based sampling algorithms for text generation has grown recently, especially for controlled generation. However, theoretically grounded, principled approaches to this task are lacking. In this paper, we take an important step toward a principled approach for sampling from language models with gradient-based methods. We use the discrete distributions given by language models to define densities, and we develop an algorithm based on Hamiltonian Monte Carlo to sample from them. We name our gradient-based technique Structured Voronoi Sampling (SVS). In an experimental setup where the reference distribution is known, we show that the empirical distribution of SVS samples is closer to the reference distribution than that of alternative sampling schemes. Furthermore, in a controlled generation task, SVS generates fluent and diverse samples while following the control targets significantly better than other methods.
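For readers unfamiliar with Hamiltonian Monte Carlo, the following is a minimal sketch of the textbook algorithm on a continuous toy density (a two-dimensional standard Gaussian). It does not implement SVS: the densities SVS builds from language model distributions are not reproduced here, and the names `leapfrog`, `hmc_sample`, `step_size`, and `n_steps`, as well as the toy target, are illustrative assumptions rather than anything taken from the paper.

```python
import numpy as np


def leapfrog(q, p, grad_log_prob, step_size, n_steps):
    """Simulate Hamiltonian dynamics with the leapfrog integrator."""
    q = q.copy()
    p = p.copy()
    p += 0.5 * step_size * grad_log_prob(q)   # initial half step for momentum
    for _ in range(n_steps - 1):
        q += step_size * p                    # full step for position
        p += step_size * grad_log_prob(q)     # full step for momentum
    q += step_size * p                        # last full step for position
    p += 0.5 * step_size * grad_log_prob(q)   # final half step for momentum
    return q, p


def hmc_sample(log_prob, grad_log_prob, q0, n_samples,
               step_size=0.2, n_steps=10, seed=0):
    """Generic HMC with a Metropolis accept/reject correction."""
    rng = np.random.default_rng(seed)
    q = np.asarray(q0, dtype=float)
    samples = np.empty((n_samples, q.size))
    n_accept = 0
    for i in range(n_samples):
        p0 = rng.standard_normal(q.shape)     # resample Gaussian momentum
        q_new, p_new = leapfrog(q, p0, grad_log_prob, step_size, n_steps)
        # Hamiltonian = potential energy (-log density) + kinetic energy
        h_old = -log_prob(q) + 0.5 * (p0 @ p0)
        h_new = -log_prob(q_new) + 0.5 * (p_new @ p_new)
        if np.log(rng.uniform()) < h_old - h_new:
            q = q_new
            n_accept += 1
        samples[i] = q
    return samples, n_accept / n_samples


# Toy target (an assumption for illustration): standard Gaussian in 2D.
def log_prob(q):
    return -0.5 * (q @ q)


def grad_log_prob(q):
    return -q


samples, accept_rate = hmc_sample(log_prob, grad_log_prob,
                                  q0=np.zeros(2), n_samples=2000)
```

The gradient of the log density steers each proposal, and the accept/reject step corrects for the discretization error of the integrator, so the chain targets the intended density.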