Mastering Olympiad-Level Physics with Artificial Intelligence

Dong-Shan Jian, Xiang Li, Chen-Xu Yan, Hui-Wen Zheng, Zhi-Zhang Bian, You-Le Fang, Ren-Xi He, Jing-Tian Zhang, Ce Meng, Ling-Shi Meng, Bing-Rui Gong, Sheng-Qi Zhang, Yan-Qing Ma

From arXiv; 8 pages, 3 figures. Content from the previous article 2510.01249 is included.

Olympiad-level physics problem-solving significantly challenges both humans and artificial intelligence (AI), as it requires integrating appropriate modeling, application of physical principles, and precise calculation within long reasoning processes. In this paper, we introduce LOCA (LOgical Chain Augmentation), an AI agent framework designed for complex physics reasoning. LOCA decomposes long reasoning into serialized atomic and verifiable steps, refining the solution through an augment-review loop. We evaluate LOCA on the 2025 Chinese Physics Olympiad (CPhO) theory examination, a rigorous testbed renowned for its depth and complexity. The framework achieves a near-perfect score of 313 out of 320 points, significantly surpassing the top human competitor and other baseline methods. Furthermore, LOCA attains a near-perfect score of 28.6 out of 30 on the IPhO 2025 examination, demonstrating its strong generalizability across different contexts. Our work points toward the development of trustworthy AI partners in both research and education.
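
The abstract gives no implementation details for the augment-review loop, so the following is only a minimal sketch of the general idea, assuming a solution is stored as a list of atomic steps. The names `Step`, `augment_review_loop`, `toy_augment`, and `toy_review` are hypothetical, and the toy functions are deterministic stand-ins for what would presumably be model calls; none of this is taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    """One atomic reasoning step: a claim plus the justification for it."""
    claim: str
    justification: str


# Hypothetical interfaces (assumptions, not LOCA's actual API).
Augment = Callable[[str, List[Step]], List[Step]]
Review = Callable[[str, List[Step]], Optional[int]]


def augment_review_loop(problem: str, draft: List[Step], augment: Augment,
                        review: Review, max_rounds: int = 5) -> List[Step]:
    """Alternate augmentation and review until the reviewer accepts the chain."""
    chain = list(draft)
    for _ in range(max_rounds):
        chain = augment(problem, chain)   # extend the chain with further steps
        bad = review(problem, chain)      # index of first rejected step, or None
        if bad is None:
            return chain
        chain = chain[:bad]               # keep only the steps before the rejected one
    return chain                          # round budget exhausted; may be incomplete


# Toy stand-ins: a canned derivation for a block sliding down a frictionless incline.
CANNED = [
    Step("Mechanical energy is conserved", "no non-conservative force does work"),
    Step("m*g*h = (1/2)*m*v**2", "potential energy converts to kinetic energy"),
    Step("v = sqrt(2*g*h)", "solve the previous equation for v"),
]


def toy_augment(problem: str, chain: List[Step]) -> List[Step]:
    # Append the canned steps that come after the current chain length.
    return chain + CANNED[len(chain):]


def toy_review(problem: str, chain: List[Step]) -> Optional[int]:
    # Reject the first step that has no justification.
    for i, step in enumerate(chain):
        if not step.justification:
            return i
    return None


# The draft jumps to the answer without justifying it.
draft = [CANNED[0], Step("v = sqrt(2*g*h)", "")]
final = augment_review_loop("Speed at the bottom of a frictionless incline of height h?",
                            draft, toy_augment, toy_review)
for step in final:
    print(f"{step.claim}  [{step.justification}]")
```

In this toy run the reviewer rejects the unjustified second step, the chain is cut back to the first step, and the next augmentation pass supplies the missing intermediate equation before the final result.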
