Parallel programming remains one of the most challenging aspects of High-Performance Computing (HPC), requiring deep knowledge of synchronization, communication, and memory models. While modern C++ standards and frameworks such as OpenMP and MPI have simplified parallelism, mastering these paradigms is still complex. Recently, Large Language Models (LLMs) have shown promise in automating code generation, but their effectiveness in producing correct and efficient HPC code is not well understood. In this work, we systematically evaluate leading LLMs, including ChatGPT-4, ChatGPT-5, Claude, and LLaMA, on the task of generating C++ implementations of the Mandelbrot set using shared-memory, directive-based, and distributed-memory paradigms. Each generated program is compiled with GCC 11.5.0 and then executed to assess its correctness, robustness, and scalability. Results show that ChatGPT-4 and ChatGPT-5 achieve strong syntactic precision and scalable performance.
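To make the benchmark task concrete, the following is a minimal sketch of what a directive-based (OpenMP) Mandelbrot kernel in C++ might look like. It is an illustration only, not code produced by any of the evaluated models or used in the study. The function names `escape_time` and `render_mandelbrot`, the viewport bounds, and the dynamic schedule are all assumptions chosen for the example.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Number of iterations of z = z*z + c before |z| exceeds 2, capped at max_iter.
// Points that reach the cap are treated as belonging to the set.
int escape_time(std::complex<double> c, int max_iter) {
    std::complex<double> z(0.0, 0.0);
    int iter = 0;
    // std::norm returns |z|^2, so comparing with 4.0 avoids a square root.
    while (iter < max_iter && std::norm(z) <= 4.0) {
        z = z * z + c;
        ++iter;
    }
    return iter;
}

// Computes escape times on a width x height grid, stored row by row.
// Assumed (hypothetical) viewport: real part in [-2, 1), imaginary part in [-1.5, 1.5).
std::vector<int> render_mandelbrot(int width, int height, int max_iter) {
    std::vector<int> image(static_cast<std::size_t>(width) * height);

    // Rows are independent, so the outer loop can be shared among threads.
    // A dynamic schedule is one way to balance rows of uneven cost.
    #pragma omp parallel for schedule(dynamic)
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            double re = -2.0 + 3.0 * x / width;
            double im = -1.5 + 3.0 * y / height;
            image[static_cast<std::size_t>(y) * width + x] =
                escape_time(std::complex<double>(re, im), max_iter);
        }
    }
    return image;
}
```

With GCC, the directive takes effect when the file is compiled with `-fopenmp`; without that flag the pragma is ignored and the same code runs serially, which gives a simple baseline for checking correctness before measuring scalability.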