Generating high-quality Physics Word Problems (PWPs) that are novel, complex, and solvable remains a challenging and underexplored problem in educational content generation. Existing approaches, many adapted from Math Word Problem (MWP) generation, often produce ambiguous, unsolvable, or structurally simple questions with limited linguistic diversity. We introduce ARVRE (Agentic Retrieval Value Reinforced Equation-chain), a two-stage framework for generating diverse and mathematically valid PWPs. In the first stage, a form of offline temporal-difference learning is used to construct valid chains of physics equations, while an agentic retrieval-augmented generation (RAG) framework dynamically selects topic-specific concepts and vocabulary. This design enables explicit control over problem structure and difficulty. In the second stage, a Large Language Model (LLM) converts the equation chain and retrieved concepts into a natural-language physics question. By grounding generation in valid equation chains, our method preserves mathematical correctness while promoting linguistic diversity and contextual richness. Human and automated evaluations demonstrate that ARVRE generates PWPs that are more complex, novel, and solvable than those produced by existing approaches. These results highlight the potential of combining reinforcement learning, retrieval, and LLMs for reliable generation of educational physics content.
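To make the first stage more concrete, the following is a minimal sketch of how temporal-difference learning over a fixed batch of logged transitions could score and assemble a chain of equations. It is not the ARVRE implementation: the abstract does not specify the state, action, or reward design, so the toy equation library, the reward of 1 for a valid link, and the names `EQUATIONS`, `links`, `build_dataset`, `train_q`, and `greedy_chain` are all assumptions made for illustration. The state is simplified to the last equation in the chain, which is only adequate because this toy library has no cycles.

```python
from itertools import permutations

# Hypothetical toy library: name -> (output variable, input variables).
EQUATIONS = {
    "velocity": ("v", {"u", "a", "t"}),        # v = u + a*t
    "momentum": ("p", {"m", "v"}),             # p = m*v
    "kinetic_energy": ("KE", {"m", "v"}),      # KE = 0.5*m*v**2
    "force": ("F", {"p", "t"}),                # F = p / t
    "power": ("P", {"KE", "t"}),               # P = KE / t
    "stopping_distance": ("d", {"KE", "F"}),   # d = KE / F
}


def links(prev, nxt):
    """A step is valid when the previous equation's output feeds the next one."""
    return EQUATIONS[prev][0] in EQUATIONS[nxt][1]


def build_dataset():
    """Fixed batch of logged transitions: (state, action, reward, done).

    Assumed reward: 1 for a valid link, 0 for an invalid one (which also
    ends the episode).
    """
    data = []
    for prev, nxt in permutations(EQUATIONS, 2):
        valid = links(prev, nxt)
        data.append((prev, nxt, 1.0 if valid else 0.0, not valid))
    return data


def train_q(data, gamma=0.9, alpha=0.5, sweeps=200):
    """Q-learning (a TD control method) run only on the fixed batch."""
    q = {s: {a: 0.0 for a in EQUATIONS if a != s} for s in EQUATIONS}
    for _ in range(sweeps):
        for s, a, r, done in data:
            # After appending equation `a`, it becomes the next state.
            bootstrap = 0.0 if done else max(q[a].values())
            q[s][a] += alpha * (r + gamma * bootstrap - q[s][a])
    return q


def greedy_chain(q, start, max_len=4):
    """Follow the highest-valued valid link until none remains."""
    chain = [start]
    while len(chain) < max_len:
        options = {a: v for a, v in q[chain[-1]].items()
                   if a not in chain and v > 0.5}
        if not options:
            break
        chain.append(max(options, key=options.get))
    return chain


q = train_q(build_dataset())
print(greedy_chain(q, "velocity"))
```

In this toy setting the learned values favor links that lead to longer valid chains, so starting from `velocity` the greedy rollout selects `momentum`, then `force`, then `stopping_distance`. Capping `max_len` is one simple way such a procedure could expose control over problem difficulty.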