On Language Generation in the Limit with Bounded Memory

We study language generation in the limit under bounded memory. In this task, a learner observes examples from an unknown target language one at a time and must eventually output only new valid examples. Prior work assumes access to the entire history, a strong assumption since realistic algorithms retain only limited information about the past. Classical work in learning theory shows that memory constraints dramatically alter learnability; we extend this line of inquiry to language generation. First, we study memoryless generators. Under a mild restriction on how the target language is enumerated, every countable collection of infinite languages remains generable without memory. Without this restriction, we exactly characterize when memoryless generation is possible. For finite collections, we characterize the optimal minimax density achievable by memoryless generators, that is, the best density guaranteed against any collection of a given size. This combinatorial bound relies on Sperner's theorem and symmetric chain decompositions. We further show that giving the generator a sliding window of the last $W$ examples does not improve this worst-case density, whereas allowing the generator to store $b$ adaptively chosen past examples improves the achievable density for every $b \geq 1$. Finally, we revisit identification in the limit, where the learner must converge to a single correct hypothesis for the target language. We focus on its incremental variant, where the learner remembers only its previous guess. Here, although exact identification already fails on a collection of just three languages, a mild relaxation requiring convergence to an ``approximate'' version of the target is achievable for every finite collection. These results show that bounded memory affects these tasks differently: generation remains achievable for every countable collection, while the density and identification guarantees are confined to finite collections and weaken as the collection grows.
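
For reference, the two combinatorial tools named above can be stated in their standard forms. The notation $[n] = \{1, \dots, n\}$ and $\mathcal{A}$ is introduced here for illustration and is not taken from the results above; how these tools translate into the density bound is not spelled out in this summary. Sperner's theorem says that if $\mathcal{A} \subseteq 2^{[n]}$ is an antichain, meaning that no member of $\mathcal{A}$ contains another, then
\[
  |\mathcal{A}| \;\le\; \binom{n}{\lfloor n/2 \rfloor},
\]
with equality attained by the family of all subsets of size $\lfloor n/2 \rfloor$. A symmetric chain decomposition partitions $2^{[n]}$ into chains $A_k \subset A_{k+1} \subset \dots \subset A_{n-k}$ with $|A_i| = i$. Every such chain contains exactly one set of size $\lfloor n/2 \rfloor$, so there are exactly $\binom{n}{\lfloor n/2 \rfloor}$ chains, and since an antichain meets each chain in at most one set, the decomposition gives the bound above.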
