The application of physics formulas is a fundamental human capability in numerical reasoning. Existing datasets often rely on implicit mathematical knowledge and rarely make the underlying formulas explicit. To address this, we introduce FormulaReasoning, a new benchmark for formula-based numerical reasoning comprising 5,324 questions that require calculations grounded in external physics principles. We provide high-quality, fine-grained annotations in English and Chinese (formula structures, parameter names, symbols, values, and units), curated through manual effort and LLM-assisted validation. We also provide a consolidated formula database as an external knowledge source. To further challenge models, we develop an extended version of the dataset by coupling multiple questions. We evaluate a range of approaches, including retrieval-augmented methods, modular reasoning (formula generation, parameter extraction, and calculation), and preference-based optimization. Our analysis identifies critical challenges in formula-based reasoning and points to significant opportunities for future methodological advancement.
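To make the modular setting (formula generation, parameter extraction, calculation) concrete, the following is a minimal sketch of what a formula-annotated question and its calculation step might look like. The record layout, the `FORMULA_DB` dictionary, the `calculate` helper, and the example values are all hypothetical illustrations chosen for this sketch; they are not the dataset's actual schema or the method evaluated in the paper.

```python
from dataclasses import dataclass


@dataclass
class Parameter:
    """One annotated quantity: name, symbol, numeric value, and unit."""
    name: str
    symbol: str
    value: float
    unit: str


# Hypothetical formula database entry (illustrative only): the target
# symbol maps to the expression, its input symbols, and the output unit.
FORMULA_DB = {
    "Q": {
        "expression": "Q = c * m * dT",
        "inputs": ("c", "m", "dT"),
        "unit": "J",
        "compute": lambda c, m, dT: c * m * dT,
    },
}


def calculate(target: str, params: list[Parameter]) -> Parameter:
    """Calculation step: look up the formula and apply it to extracted parameters."""
    formula = FORMULA_DB[target]
    by_symbol = {p.symbol: p.value for p in params}
    args = [by_symbol[s] for s in formula["inputs"]]
    return Parameter("heat absorbed", target, formula["compute"](*args), formula["unit"])


# Parameters as they might be extracted from a question such as:
# "How much heat does 2 kg of water absorb when heated by 30 degrees Celsius?"
params = [
    Parameter("specific heat capacity of water", "c", 4200.0, "J/(kg*degC)"),
    Parameter("mass", "m", 2.0, "kg"),
    Parameter("temperature change", "dT", 30.0, "degC"),
]

result = calculate("Q", params)
print(result.value, result.unit)  # 252000.0 J
```

Keeping the formula, the parameters, and the arithmetic as separate pieces mirrors the three modular stages named above, so an error can in principle be attributed to a wrong formula, a wrong extracted value, or a wrong calculation.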