While the ecosystem of Lean and Mathlib has enjoyed celebrated success in formal mathematical reasoning with the help of large language models (LLMs), the absence of many folklore lemmas in Mathlib remains a persistent barrier that limits Lean's usability as an everyday tool for mathematicians like LaTeX or Maple. To address this, we introduce MathlibLemma, the first LLM-based multi-agent system to automate the discovery and formalization of mathematical folklore lemmas. This framework constitutes our primary contribution, proactively mining the missing connective tissue of mathematics. Its efficacy is demonstrated by the production of a verified library of folklore lemmas, a subset of which has already been formally merged into the latest build of Mathlib, thereby validating the system's real-world utility and alignment with expert standards. Leveraging this pipeline, we further construct the MathlibLemma benchmark, a suite of 4,028 type-checked Lean statements spanning a broad range of mathematical domains. By transforming the role of LLMs from passive consumers to active contributors, this work establishes a constructive methodology for the self-evolution of formal mathematical libraries.
翻译:尽管在大型语言模型(LLM)的辅助下,Lean 与 Mathlib 生态系统在形式数学推理领域取得了显著成功,但 Mathlib 中仍缺失大量民间引理,这一持续存在的障碍限制了 Lean 作为日常数学工具(如 LaTeX 或 Maple)的可用性。为解决此问题,我们提出了 MathlibLemma——首个基于 LLM 的多智能体系统,用于自动化发现与形式化数学民间引理。该框架构成了我们的核心贡献,能够主动挖掘数学中缺失的“连接组织”。其有效性通过一个已验证的民间引理库得以证明,其中部分引理已正式合并至 Mathlib 的最新构建版本,从而验证了该系统在现实场景中的实用性与专家标准的一致性。基于此流程,我们进一步构建了 MathlibLemma 基准测试集——包含 4,028 个经过类型检查的 Lean 命题,涵盖广泛的数学领域。通过将 LLM 的角色从被动使用者转变为主动贡献者,本研究为形式数学库的自我演进建立了一种建设性方法论。