How did humanity coax mathematics from the aether? We explore the Platonic view that mathematics can be discovered from its axioms - a game of conjecture and proof. We describe Minimo (Mathematics from Intrinsic Motivation): an agent that jointly learns to pose challenging problems for itself (conjecturing) and solve them (theorem proving). Given a mathematical domain axiomatized in dependent type theory, we first combine methods for constrained decoding and type-directed synthesis to sample valid conjectures from a language model. Our method guarantees well-formed conjectures by construction, even as we start with a randomly initialized model. We use the same model to represent a policy and value function for guiding proof search. Our agent targets generating hard but provable conjectures - a moving target, since its own theorem proving ability also improves as it trains. We propose novel methods for hindsight relabeling on proof search trees to significantly improve the agent's sample efficiency in both tasks. Experiments on 3 axiomatic domains (propositional logic, arithmetic and group theory) demonstrate that our agent can bootstrap from only the axioms, self-improving in generating true and challenging conjectures and in finding proofs.
翻译:人类如何从虚无中催生出数学?我们探讨了柏拉图式的观点:数学可以从其公理中被发现——这是一场猜想与证明的游戏。我们描述了Minimo(源于内在动机的数学):一个能够协同学习如何为自己提出具有挑战性的问题(猜想)并解决它们(定理证明)的智能体。给定一个在依赖类型论中公理化的数学领域,我们首先结合约束解码和类型导向合成的方法,从语言模型中采样有效的猜想。我们的方法在构造上保证了猜想的良构性,即使是从随机初始化的模型开始。我们使用同一模型来表示指导证明搜索的策略函数和价值函数。我们的智能体以生成困难但可证明的猜想为目标——这是一个动态目标,因为其自身的定理证明能力也会随着训练而提升。我们提出了在证明搜索树上进行事后重标记的新方法,以显著提高智能体在这两项任务中的样本效率。在三个公理化领域(命题逻辑、算术和群论)上的实验表明,我们的智能体能够仅从公理出发进行自我引导,在生成真实且具有挑战性的猜想以及寻找证明方面实现自我改进。