Previous work in phonetically grounded language generation has mainly focused on domains such as lyrics and poetry. In this paper, we present work on the generation of tongue twisters, a form of language that must be phonetically conditioned to maximise sound overlap whilst remaining semantically consistent with an input topic and grammatically correct. We present \textbf{TwistList}, a large annotated dataset of tongue twisters consisting of 2.1K+ human-authored examples. We additionally present several benchmark systems (referred to as TwisterMisters) for the proposed task of tongue twister generation, including models that do and models that do not require training on in-domain data. We present the results of automatic and human evaluation to demonstrate the performance of existing mainstream pre-trained models on this task with limited (or no) task-specific training and data, and no explicit phonetic knowledge. We find that the task of tongue twister generation is challenging for models under these conditions, yet some models are still capable of generating acceptable examples of this language type.