Constructed languages (conlangs) such as Esperanto and Quenya have played diverse roles in art, philosophy, and international communication. Meanwhile, foundation models have revolutionized creative generation in text, images, and beyond. In this work, we use modern large language models (LLMs) as computational creativity aids for end-to-end conlang creation. We introduce ConlangCrafter, a multi-hop pipeline that decomposes language design into modular stages: phonology, morphology, syntax, lexicon generation, and translation. At each stage, our method draws on the LLM's metalinguistic reasoning capabilities, injecting randomness to encourage diversity and applying self-refinement feedback to encourage consistency in the emerging language description. We construct a novel, scalable evaluation framework for this task, with metrics that measure consistency and typological diversity. Automatic and manual evaluations demonstrate that ConlangCrafter can produce coherent and varied conlangs without requiring human linguistic expertise.
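To make the staged design concrete, the following is a minimal illustrative sketch of how such a pipeline could be organized. It is not ConlangCrafter's actual implementation. The stage names come from the abstract; everything else (the names run_stage, craft_conlang, stub_llm, and RANDOM_CONSTRAINTS, the prompt wording, the "OK" critique convention, and the refinement limit) is a hypothetical assumption made for illustration. The model is passed in as a plain callable, so no particular LLM API is assumed.

```python
import random
from typing import Callable, Dict

# Stage order taken from the abstract.
STAGES = ["phonology", "morphology", "syntax", "lexicon", "translation"]

# Hypothetical typological options sampled to inject randomness (illustrative only).
RANDOM_CONSTRAINTS = {
    "phonology": ["small consonant inventory", "tone contrasts", "vowel harmony"],
    "morphology": ["agglutinative", "isolating", "fusional"],
    "syntax": ["SOV word order", "VSO word order", "SVO word order"],
}

LLM = Callable[[str], str]  # any function mapping a prompt to a completion


def run_stage(stage: str, spec: Dict[str, str], llm: LLM,
              rng: random.Random, max_refinements: int = 2) -> str:
    """Draft one stage, then critique and revise it against the spec so far."""
    options = RANDOM_CONSTRAINTS.get(stage)
    constraint = rng.choice(options) if options else "none"
    context = "\n".join(f"[{name}] {text}" for name, text in spec.items())
    draft = llm(f"DRAFT {stage}\nConstraint: {constraint}\nSo far:\n{context}")
    for _ in range(max_refinements):
        critique = llm(f"CRITIQUE {stage}\nSo far:\n{context}\nDraft:\n{draft}")
        if critique.strip() == "OK":  # assumed convention: "OK" means consistent
            break
        draft = llm(f"REVISE {stage}\nCritique:\n{critique}\nDraft:\n{draft}")
    return draft


def craft_conlang(llm: LLM, seed: int) -> Dict[str, str]:
    """Run all stages in order; each stage sees the output of earlier stages."""
    rng = random.Random(seed)
    spec: Dict[str, str] = {}
    for stage in STAGES:
        spec[stage] = run_stage(stage, spec, llm, rng)
    return spec


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call: accepts every draft without changes."""
    kind, stage = prompt.split("\n", 1)[0].split(" ", 1)
    return "OK" if kind == "CRITIQUE" else f"<{stage} description>"


if __name__ == "__main__":
    for name, text in craft_conlang(stub_llm, seed=0).items():
        print(name, "->", text)
```

In this sketch, replacing stub_llm with a function that calls a real model would be the only change needed to generate an actual language description. The seed controls which random constraints are drawn, so different seeds would push the model toward different typological choices.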