Many optimization problems arise repeatedly from a fixed but unknown distribution. Even when the worst-case problem is hard, this distribution may carry reusable structure, such as recurring geometry, decompositions, or resource patterns. We study how to infer such structure from sample instances and compile it into solver code that runs faster on future instances while preserving solution quality. Our central abstraction is a \emph{solver hint}: distribution-specific structure inferred from samples and used to specialize a solver. We prove that the empirically fastest sample-consistent solver generalizes in both correctness and runtime over fixed solver libraries, and that identifiable hints can be recovered from polynomially many samples. We instantiate the framework with LLM code agents on $21$ combinatorial-optimization distributions across $7$ problem classes. The synthesized solvers reach mean normalized quality $0.971$ while running orders of magnitude faster than classical heuristics, Gurobi, and time-limited exact backends, though they do not dominate every baseline on every family. Against LLM synthesis baselines, they are faster than one-shot Codex, one-shot Claude Code, and a best-of-$5$ open-model variant; they improve quality over Claude Code and best-of-$5$, while nearly matching Codex quality and running substantially faster. This isolates the contribution of the iterative synthesis loop without claiming uniform domination over every LLM baseline. On the PACE 2025 Dominating Set private instances, the synthesized solver is valid on all $100$ graphs and runs roughly $75\times$--$125\times$ faster than released competition solvers, within a few percent of their solution size. These results suggest LLM agents can discover distribution-specific computational shortcuts and compile them into efficient solver code.
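As a toy illustration of the selection rule named above (the empirically fastest sample-consistent solver over a fixed library), the following sketch picks, from a small library, the solver that is acceptable on every sample instance and has the lowest total measured runtime. It is not the authors' implementation: the function `select_fastest_consistent`, the `is_acceptable` check, the injectable `timer`, and the toy "maximum of a sorted list" distribution are all hypothetical names and assumptions introduced only for this example.

```python
import time
from typing import Callable, Optional, Sequence, TypeVar

Instance = TypeVar("Instance")
Solution = TypeVar("Solution")


def select_fastest_consistent(
    library: Sequence[Callable[[Instance], Solution]],
    samples: Sequence[Instance],
    is_acceptable: Callable[[Instance, Solution], bool],
    timer: Callable[[], float] = time.perf_counter,
) -> Optional[Callable[[Instance], Solution]]:
    """Return the solver that is acceptable on all samples and has the
    smallest total measured runtime, or None if no solver qualifies.

    Hypothetical sketch: `is_acceptable` stands in for whatever
    correctness or quality check the application uses.
    """
    best_solver = None
    best_time = float("inf")
    for solver in library:
        total = 0.0
        consistent = True
        for instance in samples:
            start = timer()
            solution = solver(instance)
            total += timer() - start
            if not is_acceptable(instance, solution):
                consistent = False  # one failure disqualifies the solver
                break
        if consistent and total < best_time:
            best_solver, best_time = solver, total
    return best_solver


# Toy distribution (an assumption for illustration): every instance is a
# list sorted in ascending order, and the task is to return its maximum.
def scan_max(xs):
    return max(xs)   # generic solver, correct on any list


def last_element(xs):
    return xs[-1]    # exploits the "sorted" hint, correct only on sorted input


def first_element(xs):
    return xs[0]     # cheap but wrong on this distribution


if __name__ == "__main__":
    sample_instances = [sorted(range(i, i + 50_000)) for i in range(5)]
    chosen = select_fastest_consistent(
        [scan_max, last_element, first_element],
        sample_instances,
        lambda xs, s: s == max(xs),
    )
    print(chosen.__name__ if chosen else None)
```

In this toy setting the "hint" is that instances are sorted, so a specialized solver can skip the linear scan. The abstract's guarantee concerns how such a sample-based choice generalizes to future instances from the same distribution; the sketch only shows the selection step itself, and wall-clock timings on tiny inputs can be noisy.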