MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling - 专知论文

会员服务 ·

0

缩放 · MoDELS · 数学 · 讲稿 · Engineering ·

MaxProof: Scaling Mathematical Proof with Generative-Verifier RL and Population-Level Test-Time Scaling

翻译：暂无翻译

Jiacheng Chen,Xinyu Zhang,Shunkai Zhang,Yanmohan Wang,Lin Li,Tiancheng Qin,Qin Wang,Zhengmao Zhu,Tianle Li,Jingyang Li,Zehan Li,Binyang Jiang,Jin Zhu,Han Ding,Fei Yu,Chenyu Du,Zijian Song,Jiayuan Song,Zhi Zhang,Yunan Huang,Weiyu Cheng,Pengyu Zhao,Yu Cheng

We present MaxProof, a population-level test-time scaling framework for competition-level mathematical proof in the MiniMax-M3 series. M3 first trains three proof-oriented capabilities -- proof generation, proof verification, and critique-conditioned proof repair -- using a defense-in-depth generative verifier engineered for low false-positive rate. These capabilities are merged into a single released M3 model. At test time, MaxProof treats the model as a generator, verifier, refiner, and ranker, searches over a population of candidate proofs, and returns one final proof through tournament selection. With MaxProof test-time scaling, the M3 model reaches 35/42 on IMO 2025 and 36/42 on USAMO 2026, exceeding the human gold-medal threshold on both.

翻译：暂无翻译

0

相关内容

ACL 2025 | 高效样本利用的大模型人类评估方法

ACL 2025 | 高效样本利用的大模型人类评估方法

专知会员服务

14+阅读 · 2025年5月22日

【CVPR 2022】跨模态检索的协同双流视觉-语言前训练模型，COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval

【CVPR 2022】跨模态检索的协同双流视觉-语言前训练模型，COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval

专知会员服务

13+阅读 · 2022年3月12日

【斯坦福大学博士论文】大规模和高维统计学习方法和算法，147页pdf， Large-scale and high-dimensional statistical learning methods and algorithms

专知会员服务

26+阅读 · 2020年6月13日

回顾机器学习公平的数学框架，Review of Mathematical frameworks for Fairness in Machine Learning

回顾机器学习公平的数学框架，Review of Mathematical frameworks for Fairness in Machine Learning

专知会员服务

39+阅读 · 2020年5月30日

【论文推荐】不同图像域弱监督语义分割的综合分析，A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains

【论文推荐】不同图像域弱监督语义分割的综合分析，A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains

专知会员服务

28+阅读 · 2019年12月27日

【Open AI】利用过程生成对强化学习进行基准测试（Leveraging Procedural Generation to Benchmark Reinforcement Learning）

【Open AI】利用过程生成对强化学习进行基准测试（Leveraging Procedural Generation to Benchmark Reinforcement Learning）

专知会员服务

10+阅读 · 2019年12月3日

【变分推断课件】Lectures on Variational Inference：Statistical Analysis of Variational Approximations（附带pdf）

【变分推断课件】Lectures on Variational Inference：Statistical Analysis of Variational Approximations（附带pdf）

专知会员服务

16+阅读 · 2019年11月30日

【CoRL2019最佳论文】模仿学习，A Divergence Minimization Perspective on Imitation Learning Methods

【CoRL2019最佳论文】模仿学习，A Divergence Minimization Perspective on Imitation Learning Methods

专知会员服务

24+阅读 · 2019年11月11日

【ICCV 2019 Workshop】Complete Dictionary Learning via L4-Norm Maximization over the Orthogonal Grou，加州大学伯克利分校马毅

【ICCV 2019 Workshop】Complete Dictionary Learning via L4-Norm Maximization over the Orthogonal Grou，加州大学伯克利分校马毅

专知会员服务

16+阅读 · 2019年10月31日

Influence Maximization: Integrating and Expanding Classical Algorithms into the Social Network Context [陈卫微软亚洲研究院] 2019年中国计算机大会机器学习与数据挖掘论坛

Influence Maximization: Integrating and Expanding Classical Algorithms into the Social Network Context [陈卫微软亚洲研究院] 2019年中国计算机大会机器学习与数据挖掘论坛

专知会员服务

10+阅读 · 2019年10月26日

[AAAI 2021]图到图：面向精确可解释的联机手写数学公式识别

[AAAI 2021]图到图：面向精确可解释的联机手写数学公式识别

专知

11+阅读 · 2021年2月19日

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

专知

21+阅读 · 2020年5月30日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新六篇主题模型相关论文—领域特定知识库、神经变分推断、动态和静态主题模型

【论文推荐】最新六篇主题模型相关论文—领域特定知识库、神经变分推断、动态和静态主题模型

专知

19+阅读 · 2018年6月26日

【论文推荐】最新十篇度量学习相关论文—可量化表示、非线性度量学习、在线深度量学习、大间隔最近邻、判别深度度量、域自适应

【论文推荐】最新十篇度量学习相关论文—可量化表示、非线性度量学习、在线深度量学习、大间隔最近邻、判别深度度量、域自适应

专知

12+阅读 · 2018年5月18日

香港中大-商汤科技联合实验室AAAI录用论文详解：ST-GCN时空图卷积网络模型

香港中大-商汤科技联合实验室AAAI录用论文详解：ST-GCN时空图卷积网络模型

商汤科技

12+阅读 · 2018年2月11日

概率图模型体系：HMM、MEMM、CRF

概率图模型体系：HMM、MEMM、CRF

机器学习研究会

30+阅读 · 2018年2月10日

From Softmax to Sparsemax-ICML16（1）

From Softmax to Sparsemax-ICML16（1）

KingsGarden

74+阅读 · 2016年11月26日

面向数万处理器的有限元线性方程组与模态多级算法研究

国家自然科学基金

0+阅读 · 2015年12月31日

二维超材料中 Maxwell 方程组高阶 Nedelec 混合有限元超收敛研究

国家自然科学基金

0+阅读 · 2015年12月31日

分数随机微分方程的定性理论研究及其应用

国家自然科学基金

0+阅读 · 2015年12月31日

Filling问题的最优化原理及其求解方法研究

国家自然科学基金

0+阅读 · 2015年12月31日

函数数据变换模型及降维方法的研究

国家自然科学基金

1+阅读 · 2015年12月31日

心理与教育测量中项目反应时间数据的统计建模及其应用

国家自然科学基金

0+阅读 · 2015年12月31日

分数阶非线性偏微分方程的相关数学问题

国家自然科学基金

0+阅读 · 2014年12月31日

高维时空场数据的层次张量建模与分析方法

国家自然科学基金

2+阅读 · 2014年12月31日

一些几何发展方程中的渐近分析研究

国家自然科学基金

0+阅读 · 2014年12月31日

一类Schrodinger-Maxwell 系统解的存在性与多解性研究

国家自然科学基金

0+阅读 · 2014年12月31日

Mask-Proof: An LLM-based Automated Data Curation Pipeline on Mathematical Proofs

Arxiv

0+阅读 · 6月13日

Second-Order Least Squares as a Special Case of the Polynomial Maximization Method

Arxiv

0+阅读 · 6月9日

Testing Equality of Conditional Distributions via Generative Models

Arxiv

0+阅读 · 6月5日

Unified theory of testing relevant hypotheses in functional time series

Arxiv

0+阅读 · 5月29日

EXOTIC: An Exact, Optimistic, Tree-Based Algorithm for Min-Max Optimization

Arxiv

0+阅读 · 5月23日

A Proof-Theoretic Study of Modal Logic

Arxiv

0+阅读 · 5月18日

Scale-Sensitive Shattering: Learnability and Evaluability at Optimal Scale

Arxiv

0+阅读 · 5月13日

MathConstraint: Automated Generation of Verified Combinatorial Reasoning Instances for LLMs

Arxiv

0+阅读 · 5月8日

Maximum Cuts and Fractional Cut Covers: A Computational Study of a Randomized Semidefinite Programming Approach

Arxiv

0+阅读 · 4月19日

Quantum Property Testing for Bounded-Degree Directed Graphs

Arxiv

0+阅读 · 4月9日

VIP会员

文章信息

相关主题

最新内容

《通用大语言模型：无人机指挥与控制接口》最新40页

《通用大语言模型：无人机指挥与控制接口》最新40页

专知会员服务

0+阅读 · 18分钟前

《通过小型无人机系统将情报能力“作战化”》

《通过小型无人机系统将情报能力“作战化”》

专知会员服务

0+阅读 · 23分钟前

《神经安全型有人–无人协同：面向认知自适应作战能力的参考架构》

《神经安全型有人–无人协同：面向认知自适应作战能力的参考架构》

专知会员服务

0+阅读 · 37分钟前

《在指挥链中通过多准则决策分析传达指挥官意图：空战实验》

《在指挥链中通过多准则决策分析传达指挥官意图：空战实验》

专知会员服务

15+阅读 · 6月15日

消耗优势：美军的“精确规模化”概念

消耗优势：美军的“精确规模化”概念

专知会员服务

7+阅读 · 6月15日

五角大楼的AI优先战略及其对现代战争的启示：来自与伊朗冲突的经验教训

五角大楼的AI优先战略及其对现代战争的启示：来自与伊朗冲突的经验教训

专知会员服务

8+阅读 · 6月15日

《网络空间兵棋推演：挑战、局限性与混合路径》报告

《网络空间兵棋推演：挑战、局限性与混合路径》报告

专知会员服务

8+阅读 · 6月15日

《离线语言支持系统：面向空战战术决策》

《离线语言支持系统：面向空战战术决策》

专知会员服务

8+阅读 · 6月15日

《以通信为中心的6G–LLM架构：面向可扩展的战术自主防御车辆网络》

《以通信为中心的6G–LLM架构：面向可扩展的战术自主防御车辆网络》

专知会员服务

6+阅读 · 6月15日

ICML 2026｜ECA：面向开放式图文生成的高效持续对齐

ICML 2026｜ECA：面向开放式图文生成的高效持续对齐

专知会员服务

6+阅读 · 6月14日

可信智能体AI综述：安全、鲁棒性、隐私与系统安全

可信智能体AI综述：安全、鲁棒性、隐私与系统安全

专知会员服务

6+阅读 · 6月14日

俄乌战场地面机器人如何改写战争规则

俄乌战场地面机器人如何改写战争规则

专知会员服务

9+阅读 · 6月14日

美国海军研究生院第23届年度采购研究研讨会与创新峰会：主题“加速作战能力”，附会议报告论文集1300页

美国海军研究生院第23届年度采购研究研讨会与创新峰会：主题“加速作战能力”，附会议报告论文集1300页

专知会员服务

13+阅读 · 6月14日

《新空中力量概念：来自敏捷战斗运用的启示》2026最新50页报告

《新空中力量概念：来自敏捷战斗运用的启示》2026最新50页报告

专知会员服务

14+阅读 · 6月14日

《无人水面艇文献综述与结构设计》135页

《无人水面艇文献综述与结构设计》135页

专知会员服务

16+阅读 · 6月13日

相关VIP内容

ACL 2025 | 高效样本利用的大模型人类评估方法

ACL 2025 | 高效样本利用的大模型人类评估方法

专知会员服务

14+阅读 · 2025年5月22日

【CVPR 2022】跨模态检索的协同双流视觉-语言前训练模型，COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval

【CVPR 2022】跨模态检索的协同双流视觉-语言前训练模型，COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval

专知会员服务

13+阅读 · 2022年3月12日

【斯坦福大学博士论文】大规模和高维统计学习方法和算法，147页pdf， Large-scale and high-dimensional statistical learning methods and algorithms

专知会员服务

26+阅读 · 2020年6月13日

回顾机器学习公平的数学框架，Review of Mathematical frameworks for Fairness in Machine Learning

回顾机器学习公平的数学框架，Review of Mathematical frameworks for Fairness in Machine Learning

专知会员服务

39+阅读 · 2020年5月30日

【论文推荐】不同图像域弱监督语义分割的综合分析，A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains

【论文推荐】不同图像域弱监督语义分割的综合分析，A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains

专知会员服务

28+阅读 · 2019年12月27日

【Open AI】利用过程生成对强化学习进行基准测试（Leveraging Procedural Generation to Benchmark Reinforcement Learning）

【Open AI】利用过程生成对强化学习进行基准测试（Leveraging Procedural Generation to Benchmark Reinforcement Learning）

专知会员服务

10+阅读 · 2019年12月3日

【变分推断课件】Lectures on Variational Inference：Statistical Analysis of Variational Approximations（附带pdf）

【变分推断课件】Lectures on Variational Inference：Statistical Analysis of Variational Approximations（附带pdf）

专知会员服务

16+阅读 · 2019年11月30日

【CoRL2019最佳论文】模仿学习，A Divergence Minimization Perspective on Imitation Learning Methods

【CoRL2019最佳论文】模仿学习，A Divergence Minimization Perspective on Imitation Learning Methods

专知会员服务

24+阅读 · 2019年11月11日

【ICCV 2019 Workshop】Complete Dictionary Learning via L4-Norm Maximization over the Orthogonal Grou，加州大学伯克利分校马毅

【ICCV 2019 Workshop】Complete Dictionary Learning via L4-Norm Maximization over the Orthogonal Grou，加州大学伯克利分校马毅

专知会员服务

16+阅读 · 2019年10月31日

Influence Maximization: Integrating and Expanding Classical Algorithms into the Social Network Context [陈卫微软亚洲研究院] 2019年中国计算机大会机器学习与数据挖掘论坛

Influence Maximization: Integrating and Expanding Classical Algorithms into the Social Network Context [陈卫微软亚洲研究院] 2019年中国计算机大会机器学习与数据挖掘论坛

专知会员服务

10+阅读 · 2019年10月26日

热门VIP内容

开通专知VIP会员享更多权益服务

《通过小型无人机系统将情报能力“作战化”》

《在指挥链中通过多准则决策分析传达指挥官意图：空战实验》

《通用大语言模型：无人机指挥与控制接口》最新40页

《神经安全型有人–无人协同：面向认知自适应作战能力的参考架构》

相关资讯

[AAAI 2021]图到图：面向精确可解释的联机手写数学公式识别

[AAAI 2021]图到图：面向精确可解释的联机手写数学公式识别

专知

11+阅读 · 2021年2月19日

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

(普林斯顿讲义)：高维概率论，326页pdf《Probability in High Dimension》

专知

21+阅读 · 2020年5月30日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新六篇主题模型相关论文—领域特定知识库、神经变分推断、动态和静态主题模型

【论文推荐】最新六篇主题模型相关论文—领域特定知识库、神经变分推断、动态和静态主题模型

专知

19+阅读 · 2018年6月26日

【论文推荐】最新十篇度量学习相关论文—可量化表示、非线性度量学习、在线深度量学习、大间隔最近邻、判别深度度量、域自适应

【论文推荐】最新十篇度量学习相关论文—可量化表示、非线性度量学习、在线深度量学习、大间隔最近邻、判别深度度量、域自适应

专知

12+阅读 · 2018年5月18日

香港中大-商汤科技联合实验室AAAI录用论文详解：ST-GCN时空图卷积网络模型

香港中大-商汤科技联合实验室AAAI录用论文详解：ST-GCN时空图卷积网络模型

商汤科技

12+阅读 · 2018年2月11日

概率图模型体系：HMM、MEMM、CRF

概率图模型体系：HMM、MEMM、CRF

机器学习研究会

30+阅读 · 2018年2月10日

From Softmax to Sparsemax-ICML16（1）

From Softmax to Sparsemax-ICML16（1）

KingsGarden

74+阅读 · 2016年11月26日

相关论文

Mask-Proof: An LLM-based Automated Data Curation Pipeline on Mathematical Proofs

Arxiv

0+阅读 · 6月13日

Second-Order Least Squares as a Special Case of the Polynomial Maximization Method

Arxiv

0+阅读 · 6月9日

Testing Equality of Conditional Distributions via Generative Models

Arxiv

0+阅读 · 6月5日

Unified theory of testing relevant hypotheses in functional time series

Arxiv

0+阅读 · 5月29日

EXOTIC: An Exact, Optimistic, Tree-Based Algorithm for Min-Max Optimization

Arxiv

0+阅读 · 5月23日

A Proof-Theoretic Study of Modal Logic

Arxiv

0+阅读 · 5月18日

Scale-Sensitive Shattering: Learnability and Evaluability at Optimal Scale

Arxiv

0+阅读 · 5月13日

MathConstraint: Automated Generation of Verified Combinatorial Reasoning Instances for LLMs

Arxiv

0+阅读 · 5月8日

Maximum Cuts and Fractional Cut Covers: A Computational Study of a Randomized Semidefinite Programming Approach

Arxiv

0+阅读 · 4月19日

Quantum Property Testing for Bounded-Degree Directed Graphs

Arxiv

0+阅读 · 4月9日

相关基金

面向数万处理器的有限元线性方程组与模态多级算法研究

国家自然科学基金

0+阅读 · 2015年12月31日

二维超材料中 Maxwell 方程组高阶 Nedelec 混合有限元超收敛研究

国家自然科学基金

0+阅读 · 2015年12月31日

分数随机微分方程的定性理论研究及其应用

国家自然科学基金

0+阅读 · 2015年12月31日

Filling问题的最优化原理及其求解方法研究

国家自然科学基金

0+阅读 · 2015年12月31日

函数数据变换模型及降维方法的研究

国家自然科学基金

1+阅读 · 2015年12月31日

心理与教育测量中项目反应时间数据的统计建模及其应用

国家自然科学基金

0+阅读 · 2015年12月31日

分数阶非线性偏微分方程的相关数学问题

国家自然科学基金

0+阅读 · 2014年12月31日

高维时空场数据的层次张量建模与分析方法

国家自然科学基金

2+阅读 · 2014年12月31日

一些几何发展方程中的渐近分析研究

国家自然科学基金

0+阅读 · 2014年12月31日

一类Schrodinger-Maxwell 系统解的存在性与多解性研究

国家自然科学基金

0+阅读 · 2014年12月31日

微信扫码咨询专知VIP会员