Previous work in phonetically grounded language generation has mainly focused on domains such as lyrics and poetry. In this paper, we present work on the generation of tongue twisters, a form of language that must be phonetically conditioned to maximise sound overlap whilst remaining semantically consistent with an input topic and grammatically correct. We present \textbf{TwistList}, a large annotated dataset of tongue twisters consisting of 2.1K+ human-authored examples. We additionally present several benchmark systems (referred to as TwisterMisters) for the proposed task of tongue twister generation, including models that require training on in-domain data and models that do not. We report automatic and human evaluation results that show how existing mainstream pre-trained models perform on this task with limited (or no) task-specific training and data, and with no explicit phonetic knowledge. We find that tongue twister generation is challenging for models under these conditions, yet some models are still capable of generating acceptable examples of this language type.