Pretrained language models have demonstrated extraordinary capabilities in language generation. However, real-world tasks often require controlling the distribution of generated text in order to mitigate bias, promote fairness, and achieve personalization. Existing techniques for controlling the distribution of generated text work only with quantified distributions, which require pre-defined categories, specified proportions of the distribution, or an existing corpus that follows the desired distribution. Yet many important distributions, such as personal preferences, are unquantified. In this work, we tackle the problem of generating text that follows arbitrary distributions (quantified and unquantified) by proposing Nano, a few-shot human-in-the-loop training algorithm that continuously learns from human feedback. Nano achieves state-of-the-art results on single topic/attribute control as well as quantified distribution control compared to previous work. We also show that Nano is able to learn unquantified distributions, achieve personalization, and capture differences between individuals' personal preferences with high sample efficiency.