Constructed languages (conlangs) such as Esperanto and Quenya have played diverse roles in art, philosophy, and international communication. Meanwhile, foundation models have revolutionized creative generation in text, images, and beyond. In this work, we use modern LLMs as computational creativity aids for end-to-end conlang creation. We introduce ConlangCrafter, a multi-hop pipeline that decomposes language design into modular stages: phonology, morphology, syntax, lexicon generation, and translation. At each stage, our method draws on LLMs' metalinguistic reasoning capabilities, injecting randomness to encourage diversity and applying self-refinement feedback to encourage consistency in the emerging language description. We construct a novel, scalable evaluation framework for this task, with metrics that measure consistency and typological diversity. Automatic and manual evaluations demonstrate ConlangCrafter's ability to produce coherent and varied conlangs without human linguistic expertise.
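To make the staged design concrete, the following is a minimal sketch of how such a pipeline could be wired together. It is not ConlangCrafter's implementation: only the five stage names come from the abstract. The `LLM` callable interface, the `DRAFT`/`CRITIQUE`/`REVISE` prompt prefixes, the `"OK"` critique convention, the `WORD_ORDERS` list, and the function names `run_stage` and `craft_conlang` are all assumptions introduced for illustration.

```python
import random
from typing import Callable, Dict, List

# Stage names follow the abstract; everything else in this sketch is assumed.
STAGES: List[str] = ["phonology", "morphology", "syntax", "lexicon", "translation"]

# Hypothetical typological options, used only to illustrate randomness injection.
WORD_ORDERS: List[str] = ["SOV", "SVO", "VSO", "VOS", "OVS", "OSV"]

# Assumed model interface: a prompt string in, generated text out.
LLM = Callable[[str], str]


def run_stage(llm: LLM, stage: str, spec: Dict[str, str], hint: str,
              max_refinements: int = 2) -> str:
    """Draft one stage, then let the model critique and revise its own draft."""
    context = "\n".join(f"{name}: {text}" for name, text in spec.items())
    draft = llm(f"DRAFT {stage}\nConstraint: {hint}\nSo far:\n{context}")
    for _ in range(max_refinements):
        critique = llm(f"CRITIQUE {stage}\nDraft:\n{draft}\nSo far:\n{context}")
        if critique.strip() == "OK":  # assumed convention for "no issues found"
            break
        draft = llm(f"REVISE {stage}\nDraft:\n{draft}\nIssues:\n{critique}")
    return draft


def craft_conlang(llm: LLM, seed: int) -> Dict[str, str]:
    """Run the stages in order, feeding each one the description built so far."""
    rng = random.Random(seed)
    # One randomly sampled constraint per language nudges runs toward diversity.
    hint = f"target word order {rng.choice(WORD_ORDERS)}"
    spec: Dict[str, str] = {}
    for stage in STAGES:
        spec[stage] = run_stage(llm, stage, spec, hint)
    return spec
```

Calling `craft_conlang(my_llm, seed=0)` with any function that maps a prompt to text returns a dictionary with one entry per stage. Passing the accumulated `spec` into every prompt is what lets the critique step check a new stage against earlier decisions, and varying `seed` changes the sampled constraint.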