Kleinberg and Mullainathan recently proposed a formal framework for studying language generation, called language generation in the limit. In this model, an adversary enumerates example strings from an unknown target language, and the algorithm is tasked with correctly generating unseen strings from the target language within finite time. Refined notions of non-uniform and uniform generation were later introduced by Li, Raman, and Tewari (2025), and Raman and Raman (2025) introduced a noisy model in which the adversary may insert extraneous strings. A natural question in the noisy model is how to quantify the effect of noise by studying the impact of each additional extraneous string. We show two complementary results in this setting. First, we show that for both uniform and non-uniform generation, a single noisy string strictly reduces the set of collections that can be generated, which answers an open question of Raman and Raman (2025). Second, we show that for both uniform and non-uniform generation, generation with a single noisy string is equivalent to generation with any finite amount of noise, in sharp contrast to the strict hierarchy for noisy generation in the limit shown by Bai, Panigrahi, and Zhang (2026). Finally, we leverage these results to provide the first known characterization of non-uniform noise-dependent generatability.