Semi-Autonomous Mathematics Discovery with Gemini: A Case Study on the Erdős Problems

Tony Feng, Trieu Trinh, Garrett Bingham, Jiwon Kang, Shengtong Zhang, Sang-hyun Kim, Kevin Barreto, Carl Schildkraut, Junehyuk Jung, Jaehyeon Seo, Carlo Pagano, Yuri Chervonyi, Dawsen Hwang, Kaiying Hou, Sergei Gukov, Cheng-Chiang Tsai, Hyunwoo Choi, Youngbeom Jin, Wei-Yuan Li, Hao-An Wu, Ruey-An Shiu, Yu-Sheng Shih, Quoc V. Le, Thang Luong

Source: arXiv. Comments: corrected some typos and wording.

We present a case study in semi-autonomous mathematics discovery, using Gemini to systematically evaluate 700 conjectures labeled 'Open' in Bloom's Erdős Problems database. We employ a hybrid methodology: AI-driven natural language verification narrows the search space, and human experts then evaluate correctness and novelty. We address 13 problems that were marked 'Open' in the database: 5 through seemingly novel autonomous solutions, and 8 by identifying previous solutions in the existing literature. Our findings suggest that the 'Open' status of these problems stemmed from obscurity rather than difficulty. We also identify and discuss issues that arise when applying AI to mathematical conjectures at scale, highlighting the difficulty of literature identification and the risk of "subconscious plagiarism" by AI. We close by reflecting on the takeaways from AI-assisted efforts on the Erdős Problems.

