Differentiable programming offers transformative capabilities for scientific modeling, enabling gradient-based parameter estimation, sensitivity analysis, and data assimilation. Yet migrating legacy codebases into differentiable frameworks remains a challenge. We present a five-phase LLM-based agentic pipeline that translates legacy Fortran into JAX: static dependency analysis determines the module translation order from the full call graph; iterative compile-repair loops correct errors autonomously; and a Fortran reference oracle enforces numerical parity at the module level before integration and gradient verification. We instantiate and evaluate the pipeline on CLM-ml-v2, a 19,000-line Fortran land surface model, and analyze agent behavior across 73 module translation tasks. The resulting differentiable model computes the complete Jacobian in a single backward pass, recovers physical parameters in one-eighth as many steps as gradient-free optimization, and achieves a 24-fold wall-clock speedup over sequential Fortran at ensemble size N = 2,048. Both the translated model and the pipeline infrastructure are released as a reusable framework for differentiating other Earth system model components.
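As a rough illustration of the dependency-ordering step, the sketch below derives a translation order from a small dependency graph by topological sorting. It assumes the call graph has already been reduced to a module-level "depends on" mapping; the module names and the `translation_order` helper are hypothetical and not taken from CLM-ml-v2 or the released pipeline, and the abstract does not say which ordering algorithm is actually used.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical modules and dependencies, for illustration only.
# Each key maps to the set of modules it depends on.
depends_on = {
    "constants": set(),
    "soil_temperature": {"constants"},
    "canopy_fluxes": {"constants", "soil_temperature"},
    "driver": {"canopy_fluxes", "soil_temperature"},
}


def translation_order(deps):
    """Return modules ordered so each one follows all of its dependencies."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as err:
        # A cycle means no valid order exists; such modules would have to be
        # translated together or the cycle broken by hand.
        raise ValueError(f"circular module dependency: {err.args[1]}") from err


order = translation_order(depends_on)
print(order)
```

Translating in this order means every module's dependencies already exist in translated form when it is reached, so each module can be checked against the reference on its own before integration.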