A plausible future mathematical claim must satisfy two constraints: it should follow the direction of prior work and respect the formal dependencies that constrain what can validly follow. Existing approaches typically model only one of these sources, producing claims that are either weakly grounded or insufficiently motivated. We introduce grounded future mathematical generation, where the goal is to generate a plausible future theorem-like claim for an anchor paper using two complementary sources of context: its scientific citation graph and aligned formal theorem dependency graph. To address this setting, we propose COMPOSE, a dual-graph framework that conditions a language model on both scientific citation context and formal theorem structure. To support this setting, we construct a dataset of 108K paired scientific-formal graph examples from arXiv and Mathlib, together with a benchmark of 47K future papers from 2024--2025. Experiments show that COMPOSE outperforms strong baselines on retrieval to real future papers and achieves the best overall performance under LLM-judge evaluation, producing more grounded and mathematically richer outputs. These results show that future mathematical generation benefits from combining scientific context with formal structure. Project page is available at https://david-busbib.github.io/COMPOSE-page/.
翻译:摘要:一个合理的未来数学论断必须满足两个约束:它应当遵循先前研究的方向,同时尊重制约合法后继内容的形式化依赖关系。现有方法通常仅建模其中一种来源,因而生成的论断要么缺乏充分依据,要么动机不足。我们提出了一种有依据的未来数学生成任务,其目标是根据锚点论文的两个互补上下文来源——其科学引文图与对齐的形式化定理依赖图——生成合理的类未来定理论断。针对这一设定,我们提出了COMPOSE框架,这是一个双图框架,通过同时利用科学引文上下文与形式化定理结构来调控语言模型。为支撑该设定,我们基于arXiv与Mathlib构建了包含10.8万个配对科学-形式化图例的数据集,以及包含4.7万篇2024–2025年未来论文的基准测试。实验表明,COMPOSE在检索真实未来论文的任务中优于强基线,并在基于大语言模型评判的评估中取得综合最优性能,输出了更具依据且数学内容更丰富的生成结果。这些结果表明,结合科学上下文与形式化结构能显著提升未来数学生成任务的效果。项目页面:https://david-busbib.github.io/COMPOSE-page/。