AI Agent for Reverse-Engineering Legacy Finite-Difference Code and Translating to Devito

This study develops an integrated AI agent framework for translating legacy finite-difference implementations into the Devito environment. The system's hybrid LangGraph architecture combines Retrieval-Augmented Generation (RAG) with open-source Large Language Models (LLMs) in multi-stage iterative workflows. The agent builds a Devito knowledge graph through document parsing, structure-aware segmentation, entity-relationship extraction, and Leiden-based community detection. GraphRAG optimisation improves query performance across the resulting semantic communities, which include seismic wave simulation, computational fluid dynamics, and performance-tuning libraries. A reverse-engineering component statically analyses the Fortran source code and derives a three-level query strategy for RAG retrieval. The multi-stage retrieval pipeline combines parallel search, concept expansion, community-level retrieval, and semantic similarity analysis to supply the language model with precise context. Code synthesis is constrained by Pydantic-based schemas so that outputs are structured and reliable. A validation framework combines conventional static analysis with the G-Eval approach and covers four dimensions: execution correctness, structural soundness, mathematical consistency, and API compliance. The overall agent workflow is implemented in LangGraph and uses concurrent processing to support quality-based iterative refinement and state-aware dynamic routing. The principal contribution is the incorporation of feedback mechanisms motivated by reinforcement learning, which moves the system from static code translation toward dynamic, adaptive analytical behaviour.
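
The quality-based iterative refinement and state-aware routing described above can be illustrated with a minimal sketch. It is written in plain Python rather than against the LangGraph API, and every name in it (`AgentState`, `route`, `run`, the 0.8 threshold, the three-iteration cap, and the two stub functions) is a hypothetical stand-in, not the authors' implementation. Only the four validation dimensions are taken from the text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# The four validation dimensions named in the abstract.
DIMENSIONS = ("execution", "structure", "mathematics", "api")


@dataclass
class AgentState:
    """Hypothetical state object handed from one workflow node to the next."""
    fortran_source: str
    candidate: Optional[str] = None            # latest generated Devito code
    scores: Optional[Dict[str, float]] = None  # latest validation scores
    iteration: int = 0                         # number of drafts produced so far
    history: List[str] = field(default_factory=list)


def route(state: AgentState, threshold: float, max_iterations: int) -> str:
    """State-aware routing: choose the next node from the current state."""
    if state.candidate is None:
        return "generate"
    if state.scores is None:
        return "validate"
    if min(state.scores.values()) >= threshold:
        return "accept"
    if state.iteration >= max_iterations:
        return "give_up"
    return "refine"


def run(state: AgentState,
        generate: Callable[[AgentState, Optional[str]], str],
        validate: Callable[[AgentState], Dict[str, float]],
        threshold: float = 0.8,       # assumed acceptance threshold
        max_iterations: int = 3) -> AgentState:
    """Loop until every dimension passes or the iteration budget is spent."""
    while True:
        step = route(state, threshold, max_iterations)
        state.history.append(step)
        if step in ("accept", "give_up"):
            return state
        if step == "validate":
            state.scores = validate(state)
        else:  # "generate" or "refine"
            # Feed the weakest dimension back to the generator as guidance.
            weakest = min(state.scores, key=state.scores.get) if state.scores else None
            state.candidate = generate(state, weakest)
            state.scores = None
            state.iteration += 1


def fake_generate(state: AgentState, weakest: Optional[str]) -> str:
    # Stand-in for the LLM call; records which dimension the feedback targeted.
    return f"# draft {state.iteration + 1}, feedback on: {weakest}"


def fake_validate(state: AgentState) -> Dict[str, float]:
    # Stand-in for static analysis plus G-Eval; scores rise with each draft.
    base = 0.5 + 0.2 * state.iteration
    return {name: min(1.0, base + 0.05 * i) for i, name in enumerate(DIMENSIONS)}


final = run(AgentState(fortran_source="do i = 2, n - 1"), fake_generate, fake_validate)
print(final.history)
```

In this sketch the router never calls a node directly; it only inspects the state, which is the property that lets a graph framework swap the loop for conditional edges. The lowest-scoring dimension is passed back to the generator as a simple stand-in for the feedback mechanism.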
