While the ecosystem of Lean and Mathlib has enjoyed celebrated success in formal mathematical reasoning with the help of large language models (LLMs), the absence of many folklore lemmas in Mathlib remains a persistent barrier that limits Lean's usability as an everyday tool for mathematicians like \LaTeX{} or Maple. To address this, we introduce MathlibLemma, a modular LLM-based pipeline for automated folklore-lemma mining: the discovery, formalization, and proving of reusable intermediate facts that mathematicians often take for granted but that are not always present in formal libraries. At its core, MathlibLemma proactively mines the missing connective tissue of mathematics. The pipeline produces a verified library of folklore-style lemmas, including 1,506 Lean-checked proofs that pass a proof-bypass screen; a small curated pilot subset has also been merged into Mathlib, providing external evidence that selected outputs can meet expert library standards. Leveraging this pipeline, we further construct the MathlibLemma benchmark, a suite of 4,028 non-trivial type-checked Lean statements spanning a broad range of mathematical domains. By transforming the role of LLMs from passive consumers to active contributors, this work takes a step toward AI-assisted expansion of formal mathematical libraries.
翻译:尽管基于大语言模型的Lean与Mathlib生态系统在形式化数学推理领域取得了令人瞩目的成功,但Mathlib中大量民间引理的缺失仍是一个持续存在的障碍,限制了Lean像LaTeX或Maple那样成为数学家日常工具的可能性。为解决这一问题,我们提出MathlibLemma——一个基于大语言模型的模块化流水线,用于自动挖掘民间引理:发现、形式化并证明数学家通常视为理所当然却常未收录于形式化库中的可复用中间事实。其核心在于主动挖掘数学中缺失的"连接组织"。该流水线生成一个经验证的民间引理库,包含1,506个通过证明旁路筛选的Lean验证证明;其中一个小型精选子集已被合并至Mathlib,这为选定输出能够达到专家级库标准提供了外部证据。基于该流水线,我们进一步构建了MathlibLemma基准测试套件,包含4,028个跨越广泛数学领域的非平凡类型检查Lean语句。通过将大语言模型从被动消费者转变为主动贡献者,这项工作向人工智能辅助扩展形式化数学库迈出了关键一步。