Agentic Retrieval and Reinforcement Learned Equation Chains: A Controlled Generation Framework for Complex and Novel Physics Word Problems - 专知论文

会员服务 ·

0

数学 · 结构 · 多样性 · 强化学习 · 大语言模型 ·

Agentic Retrieval and Reinforcement Learned Equation Chains: A Controlled Generation Framework for Complex and Novel Physics Word Problems

翻译：代理检索与强化学习方程链：面向复杂新颖物理文字题的受控生成框架

Tirthankar Mittra

Generating high-quality Physics Word Problems (PWPs) that are novel, complex, and solvable remains a challenging and underexplored problem in educational content generation. Existing approaches, many adapted from Math Word Problem (MWP) generation, often produce ambiguous, unsolvable, or structurally simple questions with limited linguistic diversity. We introduce ARVRE (Agentic Retrieval Value Reinforced Equation-chain), a two-stage framework for generating diverse and mathematically valid PWPs. In the first stage, a form of offline temporal-difference learning is used to construct valid chains of physics equations, while an agentic retrieval-augmented generation (RAG) framework dynamically selects topic-specific concepts and vocabulary. This design enables explicit control over problem structure and difficulty. In the second stage, a Large Language Model (LLM) converts the equation chain and retrieved concepts into a natural-language physics question. By grounding generation in valid equation chains, our method preserves mathematical correctness while promoting linguistic diversity and contextual richness. Human and automated evaluations demonstrate that ARVRE generates PWPs that are more complex, novel, and solvable than those produced by existing approaches. These results highlight the potential of combining reinforcement learning, retrieval, and LLMs for reliable generation of educational physics content.

翻译：生成高质量、新颖、复杂且可解的物理文字题（PWP）在教育内容生成中仍是一个具有挑战性且尚未充分探索的问题。现有方法（多改编自数学文字题生成）常产生语义模糊、不可解或结构简单且语言多样性受限的题目。我们提出ARVRE（代理检索价值强化方程链）——一种两阶段框架，用于生成多样且数学合理的物理文字题。第一阶段利用离线时间差分学习构建有效物理方程链，同时通过代理检索增强生成（RAG）框架动态选择主题相关概念与词汇。该设计可显式控制问题结构与难度。第二阶段由大型语言模型（LLM）将方程链与检索所得概念转化为自然语言物理问题。通过将生成过程锚定于有效方程链，该方法在保证数学正确性的同时，促进了语言多样性与语境丰富性。人工与自动评估表明，ARVRE生成的物理文字题在复杂性、新颖性与可解性上均优于现有方法。这些结果揭示了结合强化学习、检索与LLM实现教育物理内容可靠生成的潜力。

0

相关内容

数学是关于数量、结构、变化等主题的探索。

PaperOrchestra：一种面向自动化 AI 学术论文撰写的多智能体框架

PaperOrchestra：一种面向自动化 AI 学术论文撰写的多智能体框架

专知会员服务

13+阅读 · 4月9日

【CVPR2026】面向物理一致性视频生成的事件中心型因果思维链

【CVPR2026】面向物理一致性视频生成的事件中心型因果思维链

专知会员服务

11+阅读 · 3月11日

【伯克利JD Co-Reyes博士论文】建立强化学习算法泛化:从潜在动力学模型到元学习，Building Reinforcement Learning Algorithms that Generalize: From Latent Dynamics Models to Meta-Learning

【伯克利JD Co-Reyes博士论文】建立强化学习算法泛化:从潜在动力学模型到元学习，Building Reinforcement Learning Algorithms that Generalize: From Latent Dynamics Models to Meta-Learning

专知会员服务

45+阅读 · 2022年3月6日

【综述】生成式对抗网络(GANs)最新2020综述:挑战、解决方案和未来方向，Generative Adversarial Networks (GANs): Challenges, Solutions, and Future Directions

【综述】生成式对抗网络(GANs)最新2020综述:挑战、解决方案和未来方向，Generative Adversarial Networks (GANs): Challenges, Solutions, and Future Directions

专知会员服务

63+阅读 · 2020年5月12日

【论文推荐】逆问题，深度学习，对称性破缺，Inverse Problems, Deep Learning, and Symmetry Breaking

【论文推荐】逆问题，深度学习，对称性破缺，Inverse Problems, Deep Learning, and Symmetry Breaking

专知会员服务

26+阅读 · 2020年3月27日

【NeurlPS2019论文强烈推荐】vGraph:联合社区检测和节点表示学习的生成模型，vGraph: A Generative Model for Joint Community Detection and Node Representational Learning

【NeurlPS2019论文强烈推荐】vGraph:联合社区检测和节点表示学习的生成模型，vGraph: A Generative Model for Joint Community Detection and Node Representational Learning

专知会员服务

30+阅读 · 2019年12月17日

【微软Alekh等开放新书】强化学习理论与算法（Reinforcement Learning:Theory and Algorithms），附83页pdf

【微软Alekh等开放新书】强化学习理论与算法（Reinforcement Learning:Theory and Algorithms），附83页pdf

专知会员服务

122+阅读 · 2019年11月24日

【AAAI2020论文-腾讯】通过稠密边界发生器快速学习时间动作方案（Fast Learning of Temporal Action Proposal via Dense Boundary Generator）

【AAAI2020论文-腾讯】通过稠密边界发生器快速学习时间动作方案（Fast Learning of Temporal Action Proposal via Dense Boundary Generator）

专知会员服务

12+阅读 · 2019年11月15日

【CCF优秀博士学位论文奖-2019】面向多种学习任务的深度生成模型，清华大学李崇轩

【CCF优秀博士学位论文奖-2019】面向多种学习任务的深度生成模型，清华大学李崇轩

专知会员服务

52+阅读 · 2019年11月8日

【VLDB2019 tutorial】TextCube：自动构建和多维探索，TextCube: Automated Construction and Multidimensional Exploration，韩家炜，Jingbo Shang

【VLDB2019 tutorial】TextCube：自动构建和多维探索，TextCube: Automated Construction and Multidimensional Exploration，韩家炜，Jingbo Shang

专知会员服务

27+阅读 · 2019年8月29日

【AAAI2021】知识图谱增强的预训练模型的生成式常识推理

【AAAI2021】知识图谱增强的预训练模型的生成式常识推理

专知

29+阅读 · 2021年1月25日

最新《知识驱动的文本生成》综述论文，44页pdf

最新《知识驱动的文本生成》综述论文，44页pdf

专知

26+阅读 · 2020年10月14日

GAN新书《生成式深度学习》Generative Deep Learning，附379页全文PDF

GAN新书《生成式深度学习》Generative Deep Learning，附379页全文PDF

专知

96+阅读 · 2019年9月30日

北大、清华、微软联合提出RepPoints，比边界框更好用的目标检测方法

北大、清华、微软联合提出RepPoints，比边界框更好用的目标检测方法

全球人工智能

13+阅读 · 2019年4月30日

视频生成的前沿论文，看我们推荐的7篇就够了

视频生成的前沿论文，看我们推荐的7篇就够了

人工智能前沿讲习班

34+阅读 · 2018年12月30日

<好书推荐> -《Pro Deep Learning with TensorFlow》分享

<好书推荐> -《Pro Deep Learning with TensorFlow》分享

深度学习与NLP

12+阅读 · 2018年9月13日

《pyramid Attention Network for Semantic Segmentation》

《pyramid Attention Network for Semantic Segmentation》

统计学习与视觉计算组

44+阅读 · 2018年8月30日

【论文推荐】最新八篇视频描述生成相关论文—在线视频理解、联合定位和描述事件、生成视频、跨模态注意力机制、联合事件检测和描述

【论文推荐】最新八篇视频描述生成相关论文—在线视频理解、联合定位和描述事件、生成视频、跨模态注意力机制、联合事件检测和描述

专知

11+阅读 · 2018年6月4日

【论文】所见所想所真，对抗学习GAN提升跨模态检索效果！阿里巴巴AI Labs等团队最新工作

【论文】所见所想所真，对抗学习GAN提升跨模态检索效果！阿里巴巴AI Labs等团队最新工作

专知

12+阅读 · 2017年12月21日

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

机器学习研究会

20+阅读 · 2017年12月17日

针对大规模环境下复杂任务的策略搜索强化学习方法研究

国家自然科学基金

43+阅读 · 2015年12月31日

拟线性抛物方程及微机电系统新动力学模型的基础理论研究

国家自然科学基金

0+阅读 · 2014年12月31日

复杂构型下多介质流体力学ALE方法

国家自然科学基金

0+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

2+阅读 · 2014年12月31日

物联网关键技术RFID系统安全测试的仿真架构.评估模型和受攻击模式的研究和实践

国家自然科学基金

2+阅读 · 2014年12月31日

多机理和器件界面调控侧链聚合物实现多进制

国家自然科学基金

0+阅读 · 2014年12月31日

奇异线性方程组和具有特定结构的非线性问题的研究与应用

国家自然科学基金

0+阅读 · 2014年12月31日

基于结构化方法的复杂研发项目多领域集成分析与优化研究

国家自然科学基金

2+阅读 · 2014年12月31日

一类非线性发展方程的定性理论

国家自然科学基金

0+阅读 · 2014年12月31日

基于支持向量机的复杂连续系统强化学习控制研究

国家自然科学基金

12+阅读 · 2008年12月31日

Interactor: Agentic RL oriented Iterative Creation for Ad Description Generation in Sponsored Search

Arxiv

0+阅读 · 6月14日

CausalMotion: Structured Physical Reasoning as Keyframe and Trajectory Guidance for Training-Free Video Generation

Arxiv

0+阅读 · 6月12日

VeriGeo: Controllable Geometry Question Generation with Numerical and Analytical Verification

Arxiv

0+阅读 · 6月12日

ReSum: Synergizing LLM Reasoning and Summarization with Reinforcement Learning

Arxiv

0+阅读 · 6月11日

Chain of Evidence: Pixel-Level Visual Attribution for Iterative Retrieval-Augmented Generation

Arxiv

0+阅读 · 5月23日

Ax-Prover: A Deep Reasoning Agentic Framework for Theorem Proving in Mathematics and Quantum Physics

Arxiv

0+阅读 · 5月22日

Integrating Chain-of-Thought into Generative Retrieval: A Preliminary Study

Arxiv

0+阅读 · 5月21日

PhyDetEx: Detecting and Explaining the Physical Plausibility of T2V Models

Arxiv

0+阅读 · 5月18日

OneSearch-V2: The Latent Reasoning Enhanced Self-distillation Generative Search Framework

Arxiv

0+阅读 · 5月14日

Retrieval-Augmented Tutoring for Algorithm Tracing and Problem-Solving in AI Education

Arxiv

0+阅读 · 5月13日

VIP会员

文章信息

相关主题

大语言模型

最新内容

ICML 2026 Spotlight | SmoothSMoE：解析稀疏 MoE 路由不连续

ICML 2026 Spotlight | SmoothSMoE：解析稀疏 MoE 路由不连续

专知会员服务

3+阅读 · 6月18日

综述 | 周期表视角下的大模型推理：范式、方法与失败模式

综述 | 周期表视角下的大模型推理：范式、方法与失败模式

专知会员服务

4+阅读 · 6月18日

《廉价自杀式无人机战争的军事战略影响：乌克兰和伊朗案例研究》

《廉价自杀式无人机战争的军事战略影响：乌克兰和伊朗案例研究》

专知会员服务

9+阅读 · 6月18日

《面向反无人机作战的联邦式可解释射频–光电/红外情报融合：边缘人工智能优化、电子战韧性及分布式监视验证》

《面向反无人机作战的联邦式可解释射频–光电/红外情报融合：边缘人工智能优化、电子战韧性及分布式监视验证》

专知会员服务

8+阅读 · 6月18日

ICML 2026 | FR3D：解耦自车运动的未来动态三维重建世界模型

ICML 2026 | FR3D：解耦自车运动的未来动态三维重建世界模型

专知会员服务

4+阅读 · 6月17日

【伯克利博士论文】迈向可扩展与自我演进的大语言模型智能体

【伯克利博士论文】迈向可扩展与自我演进的大语言模型智能体

专知会员服务

7+阅读 · 6月17日

学习数据的几何：形状空间分析数学综述

学习数据的几何：形状空间分析数学综述

专知会员服务

6+阅读 · 6月17日

《现代防空系统综述：架构、传感器、拦截器及新兴威胁环境对基础设施受限防御环境的影响》2026最新长综述

《现代防空系统综述：架构、传感器、拦截器及新兴威胁环境对基础设施受限防御环境的影响》2026最新长综述

专知会员服务

9+阅读 · 6月17日

定向能反无人机系统最新发展动态

定向能反无人机系统最新发展动态

专知会员服务

7+阅读 · 6月17日

从燃煤战舰到算法战争：水面指挥的永恒要求

从燃煤战舰到算法战争：水面指挥的永恒要求

专知会员服务

4+阅读 · 6月17日

《短程弹道再入飞行器拦截时间中的一项异常现象》

《短程弹道再入飞行器拦截时间中的一项异常现象》

专知会员服务

6+阅读 · 6月17日

《基于回归方法与任务上下文的对抗环境动态战术网络报文优先级排序》

《基于回归方法与任务上下文的对抗环境动态战术网络报文优先级排序》

专知会员服务

7+阅读 · 6月17日

美智库《战术级指挥控制的迫切要求：构建弹性机动式指挥控制网络》报告

美智库《战术级指挥控制的迫切要求：构建弹性机动式指挥控制网络》报告

专知会员服务

6+阅读 · 6月17日

《韩国国防政策与军备出口：韩国安全与国防政策如何塑造其国防工业与军备出口格局》最新100页报告

《韩国国防政策与军备出口：韩国安全与国防政策如何塑造其国防工业与军备出口格局》最新100页报告

专知会员服务

5+阅读 · 6月17日

ICML 2026 | VOTP：用视频基础模型与最优传输，让离线偏好强化学习只需少量反馈

ICML 2026 | VOTP：用视频基础模型与最优传输，让离线偏好强化学习只需少量反馈

专知会员服务

6+阅读 · 6月16日

相关VIP内容

PaperOrchestra：一种面向自动化 AI 学术论文撰写的多智能体框架

PaperOrchestra：一种面向自动化 AI 学术论文撰写的多智能体框架

专知会员服务

13+阅读 · 4月9日

【CVPR2026】面向物理一致性视频生成的事件中心型因果思维链

【CVPR2026】面向物理一致性视频生成的事件中心型因果思维链

专知会员服务

11+阅读 · 3月11日

【伯克利JD Co-Reyes博士论文】建立强化学习算法泛化:从潜在动力学模型到元学习，Building Reinforcement Learning Algorithms that Generalize: From Latent Dynamics Models to Meta-Learning

【伯克利JD Co-Reyes博士论文】建立强化学习算法泛化:从潜在动力学模型到元学习，Building Reinforcement Learning Algorithms that Generalize: From Latent Dynamics Models to Meta-Learning

专知会员服务

45+阅读 · 2022年3月6日

【综述】生成式对抗网络(GANs)最新2020综述:挑战、解决方案和未来方向，Generative Adversarial Networks (GANs): Challenges, Solutions, and Future Directions

【综述】生成式对抗网络(GANs)最新2020综述:挑战、解决方案和未来方向，Generative Adversarial Networks (GANs): Challenges, Solutions, and Future Directions

专知会员服务

63+阅读 · 2020年5月12日

【论文推荐】逆问题，深度学习，对称性破缺，Inverse Problems, Deep Learning, and Symmetry Breaking

【论文推荐】逆问题，深度学习，对称性破缺，Inverse Problems, Deep Learning, and Symmetry Breaking

专知会员服务

26+阅读 · 2020年3月27日

【NeurlPS2019论文强烈推荐】vGraph:联合社区检测和节点表示学习的生成模型，vGraph: A Generative Model for Joint Community Detection and Node Representational Learning

【NeurlPS2019论文强烈推荐】vGraph:联合社区检测和节点表示学习的生成模型，vGraph: A Generative Model for Joint Community Detection and Node Representational Learning

专知会员服务

30+阅读 · 2019年12月17日

【微软Alekh等开放新书】强化学习理论与算法（Reinforcement Learning:Theory and Algorithms），附83页pdf

【微软Alekh等开放新书】强化学习理论与算法（Reinforcement Learning:Theory and Algorithms），附83页pdf

专知会员服务

122+阅读 · 2019年11月24日

【AAAI2020论文-腾讯】通过稠密边界发生器快速学习时间动作方案（Fast Learning of Temporal Action Proposal via Dense Boundary Generator）

【AAAI2020论文-腾讯】通过稠密边界发生器快速学习时间动作方案（Fast Learning of Temporal Action Proposal via Dense Boundary Generator）

专知会员服务

12+阅读 · 2019年11月15日

【CCF优秀博士学位论文奖-2019】面向多种学习任务的深度生成模型，清华大学李崇轩

【CCF优秀博士学位论文奖-2019】面向多种学习任务的深度生成模型，清华大学李崇轩

专知会员服务

52+阅读 · 2019年11月8日

【VLDB2019 tutorial】TextCube：自动构建和多维探索，TextCube: Automated Construction and Multidimensional Exploration，韩家炜，Jingbo Shang

【VLDB2019 tutorial】TextCube：自动构建和多维探索，TextCube: Automated Construction and Multidimensional Exploration，韩家炜，Jingbo Shang

专知会员服务

27+阅读 · 2019年8月29日

热门VIP内容

开通专知VIP会员享更多权益服务

综述 | 周期表视角下的大模型推理：范式、方法与失败模式

《面向反无人机作战的联邦式可解释射频–光电/红外情报融合：边缘人工智能优化、电子战韧性及分布式监视验证》

ICML 2026 Spotlight | SmoothSMoE：解析稀疏 MoE 路由不连续

《廉价自杀式无人机战争的军事战略影响：乌克兰和伊朗案例研究》

相关资讯

【AAAI2021】知识图谱增强的预训练模型的生成式常识推理

【AAAI2021】知识图谱增强的预训练模型的生成式常识推理

专知

29+阅读 · 2021年1月25日

最新《知识驱动的文本生成》综述论文，44页pdf

最新《知识驱动的文本生成》综述论文，44页pdf

专知

26+阅读 · 2020年10月14日

GAN新书《生成式深度学习》Generative Deep Learning，附379页全文PDF

GAN新书《生成式深度学习》Generative Deep Learning，附379页全文PDF

专知

96+阅读 · 2019年9月30日

北大、清华、微软联合提出RepPoints，比边界框更好用的目标检测方法

北大、清华、微软联合提出RepPoints，比边界框更好用的目标检测方法

全球人工智能

13+阅读 · 2019年4月30日

视频生成的前沿论文，看我们推荐的7篇就够了

视频生成的前沿论文，看我们推荐的7篇就够了

人工智能前沿讲习班

34+阅读 · 2018年12月30日

<好书推荐> -《Pro Deep Learning with TensorFlow》分享

<好书推荐> -《Pro Deep Learning with TensorFlow》分享

深度学习与NLP

12+阅读 · 2018年9月13日

《pyramid Attention Network for Semantic Segmentation》

《pyramid Attention Network for Semantic Segmentation》

统计学习与视觉计算组

44+阅读 · 2018年8月30日

【论文推荐】最新八篇视频描述生成相关论文—在线视频理解、联合定位和描述事件、生成视频、跨模态注意力机制、联合事件检测和描述

【论文推荐】最新八篇视频描述生成相关论文—在线视频理解、联合定位和描述事件、生成视频、跨模态注意力机制、联合事件检测和描述

专知

11+阅读 · 2018年6月4日

【论文】所见所想所真，对抗学习GAN提升跨模态检索效果！阿里巴巴AI Labs等团队最新工作

【论文】所见所想所真，对抗学习GAN提升跨模态检索效果！阿里巴巴AI Labs等团队最新工作

专知

12+阅读 · 2017年12月21日

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

机器学习研究会

20+阅读 · 2017年12月17日

相关论文

Interactor: Agentic RL oriented Iterative Creation for Ad Description Generation in Sponsored Search

Arxiv

0+阅读 · 6月14日

CausalMotion: Structured Physical Reasoning as Keyframe and Trajectory Guidance for Training-Free Video Generation

Arxiv

0+阅读 · 6月12日

VeriGeo: Controllable Geometry Question Generation with Numerical and Analytical Verification

Arxiv

0+阅读 · 6月12日

ReSum: Synergizing LLM Reasoning and Summarization with Reinforcement Learning

Arxiv

0+阅读 · 6月11日

Chain of Evidence: Pixel-Level Visual Attribution for Iterative Retrieval-Augmented Generation

Arxiv

0+阅读 · 5月23日

Ax-Prover: A Deep Reasoning Agentic Framework for Theorem Proving in Mathematics and Quantum Physics

Arxiv

0+阅读 · 5月22日

Integrating Chain-of-Thought into Generative Retrieval: A Preliminary Study

Arxiv

0+阅读 · 5月21日

PhyDetEx: Detecting and Explaining the Physical Plausibility of T2V Models

Arxiv

0+阅读 · 5月18日

OneSearch-V2: The Latent Reasoning Enhanced Self-distillation Generative Search Framework

Arxiv

0+阅读 · 5月14日

Retrieval-Augmented Tutoring for Algorithm Tracing and Problem-Solving in AI Education

Arxiv

0+阅读 · 5月13日

相关基金

针对大规模环境下复杂任务的策略搜索强化学习方法研究

国家自然科学基金

43+阅读 · 2015年12月31日

拟线性抛物方程及微机电系统新动力学模型的基础理论研究

国家自然科学基金

0+阅读 · 2014年12月31日

复杂构型下多介质流体力学ALE方法

国家自然科学基金

0+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

2+阅读 · 2014年12月31日

物联网关键技术RFID系统安全测试的仿真架构.评估模型和受攻击模式的研究和实践

国家自然科学基金

2+阅读 · 2014年12月31日

多机理和器件界面调控侧链聚合物实现多进制

国家自然科学基金

0+阅读 · 2014年12月31日

奇异线性方程组和具有特定结构的非线性问题的研究与应用

国家自然科学基金

0+阅读 · 2014年12月31日

基于结构化方法的复杂研发项目多领域集成分析与优化研究

国家自然科学基金

2+阅读 · 2014年12月31日

一类非线性发展方程的定性理论

国家自然科学基金

0+阅读 · 2014年12月31日

基于支持向量机的复杂连续系统强化学习控制研究

国家自然科学基金

12+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员