Finding and selecting new and interesting problems to solve is at the heart of curiosity, science and innovation. Here, we study automated problem generation in the open-ended space of Python programming puzzles. Existing generative models often aim to model a reference distribution without any explicit diversity optimization. Other methods do optimize explicitly for diversity, but only in limited hand-coded representation spaces or in uninterpretable learned embedding spaces that may not align with human perceptions of interesting variations. With ACES (Autotelic Code Exploration via Semantic descriptors), we introduce a new autotelic generation method that combines few-shot-based generation with semantic descriptors produced by a large language model (LLM) to directly optimize for interesting diversity. Each puzzle is labeled along 10 dimensions, each capturing a programming skill required to solve it. ACES generates and pursues novel and feasible goals to explore that abstract semantic space, gradually discovering a diversity of solvable programming puzzles in any given run. Across a set of experiments, we show that ACES discovers a richer diversity of puzzles than existing diversity-maximizing algorithms, as measured across a range of diversity metrics. We further study whether and under which conditions this diversity translates into the successful training of puzzle-solving models.
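To make the idea of exploring a 10-dimensional skill space more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a puzzle is a function `f` paired with a solution `g` such that `f(g())` returns `True`, that each of the 10 skill labels is binary, and that a goal is an unexplored descriptor one skill flip away from a discovered one. The names `is_feasible`, `add_to_archive` and `sample_goal` are hypothetical, and the LLM calls that would label and generate puzzles are left out.

```python
import random

N_SKILLS = 10  # from the abstract; treating each label as binary is an assumption


def is_feasible(puzzle_src, solution_src):
    """Return True if the solution g() satisfies the puzzle f (assumed format)."""
    scope = {}
    try:
        exec(puzzle_src, scope)    # defines f(candidate) -> bool
        exec(solution_src, scope)  # defines g() -> candidate
        return scope["f"](scope["g"]()) is True
    except Exception:
        return False


def add_to_archive(archive, descriptor, puzzle_src, solution_src):
    """Store a puzzle in its descriptor cell only if it is solvable."""
    if len(descriptor) != N_SKILLS or not is_feasible(puzzle_src, solution_src):
        return False
    archive.setdefault(tuple(descriptor), []).append((puzzle_src, solution_src))
    return True


def sample_goal(archive, rng):
    """Pick an unexplored cell one skill flip away from a discovered one."""
    if not archive:
        return tuple(rng.randint(0, 1) for _ in range(N_SKILLS))
    neighbours = set()
    for cell in archive:
        for i in range(N_SKILLS):
            flipped = cell[:i] + (1 - cell[i],) + cell[i + 1:]
            if flipped not in archive:
                neighbours.add(flipped)
    if not neighbours:  # every reachable cell is already filled
        return rng.choice(sorted(archive))
    return rng.choice(sorted(neighbours))


archive = {}
puzzle = "def f(x):\n    return x * x == 49 and x > 0\n"
solution = "def g():\n    return 7\n"
add_to_archive(archive, (1,) + (0,) * 9, puzzle, solution)
goal = sample_goal(archive, random.Random(0))
```

In this sketch, a generator would then be prompted with few-shot examples drawn from the archive to produce a puzzle matching `goal`. Note that `exec` runs code without any isolation, so a real system would need to sandbox generated puzzles.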