Program synthesis methods aim to automatically generate programs, restricted to a given language, that explain a specification of input-output pairs. While purely symbolic approaches suffer from a combinatorial search space, recent methods leverage neural networks to learn distributions over program structures, significantly narrowing the search space and enabling more efficient search. However, for challenging problems, it remains difficult to train models to perform program synthesis in one shot, making test-time search essential. Most neural methods lack structured search mechanisms during inference, relying instead on stochastic sampling or gradient updates, which can be inefficient. In this work, we propose the Latent Program Network (LPN), a general algorithm for program induction that learns a distribution over latent programs in a continuous space, enabling efficient search and test-time adaptation. We explore how to train these networks to optimize for test-time computation and demonstrate the use of gradient-based search both during training and at test time. We evaluate LPN on ARC-AGI, a program synthesis benchmark that measures performance by generalizing programs to new inputs rather than merely explaining the underlying specification. We show that LPN can generalize beyond its training distribution and adapt to unseen tasks by utilizing test-time computation, outperforming algorithms without test-time adaptation mechanisms.
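The core mechanism described above, searching over a continuous latent program space by gradient descent on a reconstruction loss at test time, can be illustrated with a minimal toy sketch. Everything here is hypothetical and not the paper's implementation: the "decoder" is a trivial scaling program `y = x * z` parameterized by a scalar latent `z`, and the initial latent stands in for an encoder's output. The true underlying program is `y = 3x`; gradient steps on the latent recover it from the specification alone, after which the decoded program generalizes to an unseen input.

```python
# Toy sketch of test-time latent search (hypothetical setup, not LPN's
# actual architecture): a scalar latent z parameterizes the decoded
# program decode(z, x) = x * z. We refine an initial latent guess by
# gradient descent on the mean-squared reconstruction loss over the
# input-output specification.

def decode(z, x):
    return x * z  # toy decoder: the latent scales the input


def grad_loss(z, xs, ys):
    # analytic gradient of mean((decode(z, x) - y)^2) with respect to z
    n = len(xs)
    return sum(2.0 * (decode(z, x) - y) * x for x, y in zip(xs, ys)) / n


# specification: input-output pairs generated by the unknown program y = 3x
xs = [1.0, 2.0, 3.0]
ys = [3.0 * x for x in xs]

z = 0.0   # initial latent, standing in for an encoder's prediction
lr = 0.05
for _ in range(200):
    z -= lr * grad_loss(z, xs, ys)

# after search the latent has converged near z = 3, so the decoded
# program also generalizes to inputs outside the specification
print(round(z, 4), round(decode(z, 10.0), 4))
```

The key property this sketch shares with the abstract's description is that adaptation happens purely in the latent space: no model weights change at test time, only the latent code, which makes the search cheap and structured compared to stochastic sampling.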