Compile Once, Differentiate Everywhere: A Differentiable Meta-Circular Interpreter

The boundary between program execution and gradient-based optimization has long limited the use of code itself as a learnable scientific model. We present a compiler that translates a self-hosting subset of Scheme into differentiable computation graphs for autograd backends. Because the subset can compile its own evaluator, the result is differentiable meta-circular interpretation (DMCI): a compiled Scheme interpreter executes programs supplied as data, while reverse-mode autodiff propagates gradients to the continuous constants embedded in those programs. The interpreter is compiled once, so new programs inherit differentiability without recompilation or custom gradient machinery while retaining closures, recursion, and data structures. We prove that gradients through the compiled interpreter are correct almost everywhere and show that they match those obtained by direct compilation to numerical precision across 171 recursive and higher-order program-seed pairs. We then use DMCI for program-and-parameter co-search in the style of OpenEvolve: an outer loop, in which a large language model proposes discrete Scheme program structures, is paired with exact gradient-based calibration of each candidate's continuous parameters through a single frozen interpreter. On battery capacity-fade data, the search recovers a knee-like degradation structure; it improves held-out extrapolation over hand-crafted baselines on the harder early-extrapolation split and matches them on the later split. On a high-dimensional El Niño inverse problem, DMCI optimizes an interpreted Kalman-filter likelihood where gradient-free search fails. These results extend symbolic regression and neurosymbolic search from closed-form expressions to executable, stateful programs, making model-generated code directly optimizable against data.
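To make the core idea of the abstract concrete, the following is a minimal sketch, not the authors' compiler. It assumes Python as a stand-in for the Scheme subset, nested tuples as a stand-in for S-expressions, and a small hand-rolled reverse-mode tape (the hypothetical `Var` class and `backward` function) in place of a real autograd backend. The interpreter `evaluate` is written once; programs are passed to it as data, and gradients reach the continuous constants embedded in them.

```python
class Var:
    """Scalar node on a reverse-mode autodiff tape (illustrative only)."""
    def __init__(self, value, parents=()):
        self.value = float(value)
        self.parents = parents   # pairs of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = lift(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))
    __radd__ = __add__

    def __mul__(self, other):
        other = lift(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))
    __rmul__ = __mul__


def lift(x):
    """Wrap a plain number as a constant tape node."""
    return x if isinstance(x, Var) else Var(x)


def backward(out):
    """Accumulate d(out)/d(node) into .grad for every node reachable from out."""
    order, seen = [], set()

    def visit(v):
        if id(v) in seen:
            return
        seen.add(id(v))
        for parent, _ in v.parents:
            visit(parent)
        order.append(v)

    visit(out)
    out.grad = 1.0
    for v in reversed(order):          # each node is processed before its parents
        for parent, local in v.parents:
            parent.grad += v.grad * local


def evaluate(expr, env):
    """Interpreter written once; programs arrive as nested tuples (data)."""
    if isinstance(expr, Var):          # continuous constant embedded in the program
        return expr
    if isinstance(expr, (int, float)): # plain number, e.g. a loop counter
        return expr
    if isinstance(expr, str):          # variable reference
        return env[expr]
    head = expr[0]
    if head == "if":
        _, test, then, alt = expr
        return evaluate(then if evaluate(test, env) else alt, env)
    if head == "lambda":
        _, params, body = expr
        return lambda *args: evaluate(body, {**env, **dict(zip(params, args))})
    fn = evaluate(head, env)
    return fn(*[evaluate(arg, env) for arg in expr[1:]])


def make_env():
    """Primitive environment plus one recursive function defined as data."""
    env = {
        "+": lambda a, b: a + b,
        "*": lambda a, b: a * b,
        "-": lambda a, b: a - b,
        "=": lambda a, b: getattr(a, "value", a) == getattr(b, "value", b),
    }
    # (lambda (x n) (if (= n 0) 1 (* x (pow x (- n 1)))))
    env["pow"] = evaluate(
        ("lambda", ("x", "n"),
         ("if", ("=", "n", 0), 1, ("*", "x", ("pow", "x", ("-", "n", 1))))),
        env)
    return env


# Program 1: a recursive program with one embedded continuous constant.
theta = Var(2.0)
out = evaluate(("pow", theta, 3), make_env())
backward(out)
print(out.value, theta.grad)           # 8.0 12.0  (theta**3 and 3 * theta**2)

# Program 2: a different program runs through the same, unchanged interpreter.
a, b = Var(0.5), Var(1.0)
out2 = evaluate(("+", ("*", a, 3.0), b), make_env())
backward(out2)
print(out2.value, a.grad, b.grad)      # 2.5 3.0 1.0
```

Both programs run through the same unchanged `evaluate`, which is the sense in which new programs inherit differentiability. The branch in `if` is chosen from concrete values, so this toy's gradient is that of the executed branch only.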
