We present a case study in semi-autonomous mathematics discovery, using Gemini to systematically evaluate 700 conjectures labeled 'Open' in Bloom's Erdős Problems database. We employ a hybrid methodology: AI-driven natural language verification to narrow the search space, followed by human expert evaluation to gauge correctness and novelty. We address 13 problems that were marked 'Open' in the database: 5 through seemingly novel autonomous solutions, and 8 through identification of previous solutions in the existing literature. Our findings suggest that the 'Open' status of these problems was due to obscurity rather than difficulty. We also identify and discuss issues that arise when applying AI to mathematical conjectures at scale, highlighting the difficulty of literature identification and the risk of 'subconscious plagiarism' by AI. Finally, we reflect on the takeaways from AI-assisted efforts on the Erdős Problems.