Agentic Framework for Deep Learning Workload Migration via In-Context Learning

Translating deep learning models from PyTorch's flexible, object-oriented design to JAX's functional, stateless style is usually a manual, error-prone task. Automated migration is challenging because Large Language Models (LLMs) struggle with strict, dynamic API alignment and are prone to mistakes on operations that demand precision. We propose a fully autonomous system that combines In-Context Learning (ICL) with oracle-driven self-debugging. First, we curate an ICL context that serves as a strict reference for idiomatic JAX style and test case generation. Second, instead of relying on the LLM to deduce mathematical outputs, we execute the source PyTorch modules and record their actual runtime tensor states, which creates an immutable execution oracle. An autonomous agentic loop then synthesizes tests from the oracle data, executes them iteratively, and feeds each traceback back to the LLM for self-correction. Ablations show that combining ICL references with oracle grounding and self-debugging substantially outperforms both pure instructional and basic agentic baselines, without adding excessive computational overhead. Our lightweight pipeline achieves 91% numerical equivalence on neural modules (compared to 9% for the baseline and 27% for instruction + self-debugging), providing a reliable, scalable blueprint for cross-framework migration. The approach has been validated on several state-of-the-art models, including SAM (Segment Anything), T5, and Code Whisper, among others, all showing high numerical equivalence. Code: https://github.com/AI-Hypercomputer/accelerator-agents/tree/main/MaxCode
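
The oracle-driven self-debugging loop described above can be illustrated with a minimal sketch. This is not the MaxCode implementation: it uses only the Python standard library so that it runs anywhere, and every name in it is a hypothetical stand-in. `reference_module` plays the role of the source PyTorch module, `propose_candidate` plays the role of the LLM call (here it just returns canned attempts and ignores the feedback), and `migrated` is the assumed name of the translated function.

```python
import math
import traceback


def reference_module(x):
    """Stand-in for the source PyTorch module: a numerically stable softmax."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def record_oracle(inputs):
    """Run the reference once per input and freeze (input, output) pairs as tuples."""
    return tuple((tuple(x), tuple(reference_module(list(x)))) for x in inputs)


# Canned "LLM" attempts (hypothetical): a crash, a wrong answer, then a correct port.
CANDIDATES = [
    "def migrated(x):\n"
    "    exps = [math.exp(v) for v in x]\n"
    "    return [e / totl for e in exps]\n",  # NameError: 'totl' is undefined
    "def migrated(x):\n"
    "    return [math.exp(v) for v in x]\n",  # runs, but is not normalized
    "def migrated(x):\n"
    "    exps = [math.exp(v) for v in x]\n"
    "    total = sum(exps)\n"
    "    return [e / total for e in exps]\n",  # matches the oracle
]


def propose_candidate(attempt, feedback):
    """Hypothetical LLM call. A real system would condition on `feedback`."""
    return CANDIDATES[min(attempt, len(CANDIDATES) - 1)]


def run_oracle_tests(source, oracle, tol=1e-6):
    """Execute a candidate against the oracle. Return None on success, else a traceback string."""
    namespace = {"math": math}
    try:
        exec(source, namespace)
        fn = namespace["migrated"]
        for x, expected in oracle:
            got = fn(list(x))
            if len(got) != len(expected):
                raise AssertionError(f"input {x}: length mismatch")
            for g, e in zip(got, expected):
                if not math.isclose(g, e, rel_tol=tol, abs_tol=tol):
                    raise AssertionError(f"input {x}: expected {expected}, got {got}")
    except Exception:
        return traceback.format_exc()
    return None


def migrate(oracle, max_attempts=5):
    """Generate, test, and retry until the oracle tests pass or attempts run out."""
    feedback = None
    history = []
    for attempt in range(max_attempts):
        source = propose_candidate(attempt, feedback)
        feedback = run_oracle_tests(source, oracle)
        history.append(feedback)
        if feedback is None:
            return source, history
    return None, history


oracle = record_oracle([[0.0, 1.0, 2.0], [-1.0, 0.5, 3.0]])
source, history = migrate(oracle)
```

The point of the design is that expected values come from executing the reference, never from the model's own reasoning, so a candidate can only pass by reproducing the recorded outputs within tolerance. In a real pipeline the oracle would hold tensors captured from PyTorch and the comparison would use an array-level tolerance check.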
