ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution

The transition from sequential to parallel computing is essential for modern high-performance applications but is hindered by the steep learning curve of concurrent programming. This challenge is magnified for irregular data structures (such as sparse graphs, unbalanced trees, and non-uniform meshes), where static scheduling fails and data dependencies are unpredictable. Current large language models (LLMs) often fail catastrophically on these tasks, generating code plagued by subtle race conditions, deadlocks, and suboptimal scaling. We bridge this gap with ParEVO, a framework designed to synthesize high-performance parallel algorithms for irregular data. Our contributions include: (1) the Parlay-Instruct Corpus, a curated dataset of 13,820 tasks synthesized via a "Critic-Refine" pipeline that explicitly filters for empirically performant algorithms that make effective use of Work-Span parallel primitives; (2) specialized DeepSeek, Qwen, and Gemini models fine-tuned to align probabilistic generation with the rigorous semantics of the ParlayLib library; and (3) an Evolutionary Coding Agent (ECA) that improves the "last mile" of correctness by iteratively repairing code using feedback from compilers, dynamic race detectors, and performance profilers. On the ParEval benchmark, ParEVO achieves an average 106x speedup (with a maximum of 1103x) across the suite, and a robust 13.6x speedup specifically on complex irregular graph problems, outperforming state-of-the-art commercial models. Furthermore, our evolutionary approach matches state-of-the-art expert human baselines, achieving up to a 4.1x speedup on specific highly irregular kernels. Source code and datasets are available at https://github.com/WildAlg/ParEVO.
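
To make the feedback-driven repair loop of contribution (3) concrete, the following is a minimal illustrative sketch, not ParEVO's actual implementation. The abstract does not specify how candidates are scored or selected, so everything here is an assumption: the `Feedback` record, the lexicographic `fitness` function (compilation first, then race-freedom, then measured speedup), and the `evolve` loop with simple elitist selection are hypothetical names introduced only for illustration. In a real system, `evaluate` would invoke a compiler, a dynamic race detector, and a profiler, and `repair` would query a model with that feedback.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical feedback record for one candidate program.
struct Feedback {
    bool compiles;   // the compiler accepted the candidate
    bool race_free;  // the dynamic race detector reported no races
    double speedup;  // measured speedup over a sequential baseline (>= 0)
};

// Assumed scoring rule: correctness gates come before performance, so any
// race-free candidate outranks any racy one regardless of measured speedup.
double fitness(const Feedback& f) {
    if (!f.compiles) return 0.0;
    if (!f.race_free) return 1.0;
    return 2.0 + f.speedup;
}

// Generic evolutionary repair loop (a sketch under the assumptions above).
// Preconditions: population is non-empty and survivors >= 1.
//   evaluate(candidate)         -> Feedback
//   repair(candidate, feedback) -> a new candidate derived from the old one
template <typename Candidate, typename Evaluate, typename Repair>
Candidate evolve(std::vector<Candidate> population, Evaluate evaluate,
                 Repair repair, int generations, std::size_t survivors) {
    struct Scored {
        Candidate cand;
        Feedback fb;
    };
    std::vector<Scored> scored;
    for (int g = 0; g <= generations; ++g) {
        // Score every candidate with the tool feedback.
        scored.clear();
        for (const Candidate& c : population) {
            scored.push_back({c, evaluate(c)});
        }
        // Rank best-first and keep only the top `survivors`.
        std::stable_sort(scored.begin(), scored.end(),
                         [](const Scored& a, const Scored& b) {
                             return fitness(a.fb) > fitness(b.fb);
                         });
        if (scored.size() > survivors) {
            scored.erase(scored.begin() + static_cast<std::ptrdiff_t>(survivors),
                         scored.end());
        }
        if (g == generations) break;
        // Next generation: each survivor plus one repaired child.
        population.clear();
        for (const Scored& s : scored) {
            population.push_back(s.cand);
            population.push_back(repair(s.cand, s.fb));
        }
    }
    return scored.front().cand;
}
```

Ranking correctness ahead of speed is one plausible design choice: a fast but racy candidate is never preferred over a slower race-free one, which mirrors the abstract's emphasis on the "last mile" of correctness. The loop can be exercised with toy stand-ins, for example an integer "repair level" as the candidate and a `repair` callable that increments it.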
