We present GoldbachGPU, an open-source framework for large-scale computational verification of Goldbach's conjecture on commodity GPU hardware. Prior GPU-based approaches reported a hard memory ceiling near 10^11, caused by allocating the prime table as a single monolithic block. We show that this limitation is architectural rather than fundamental: a dense bit-packed prime representation reduces the memory footprint by 16x, and a segmented double-sieve design removes the VRAM ceiling entirely. By inverting the verification loop and combining a GPU fast path with a multi-phase primality oracle, the framework exhaustively verifies the conjecture up to 10^12 on a single NVIDIA RTX 3070 (8 GB VRAM), with no counterexamples found. Each segment requires 14 MB of VRAM, so memory use is O(1) in N while wall-clock time is O(N). A rigorous CPU fallback guarantees mathematical completeness, though it was never invoked in practice. An arbitrary-precision checker built on GMP and OpenMP extends single-number verification to 10^10000 via a synchronised batch-search strategy. The segmented architecture also scales cleanly across multiple GPUs on data-centre hardware (tested on 8 × H100). All code is open-source, documented, and reproducible on both commodity and high-end hardware.
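To make the two memory ideas concrete, the following is a minimal CPU-only sketch in Python of a bit-packed, segmented sieve with a small-prime fast path and a slow fallback. It is an illustration under stated assumptions, not the GoldbachGPU implementation: every function name, the odd-only bit layout, the segment size, and the fast-path limit of 200 are hypothetical choices made for this example. Storing one bit per odd number is one layout that is 16 times smaller than a byte-per-integer table; whether the framework uses exactly this layout is an assumption.

```python
from math import isqrt

def small_primes_up_to(limit):
    """Plain sieve of Eratosthenes; returns all primes <= limit (limit >= 1)."""
    flags = bytearray(b"\x01") * (limit + 1)
    flags[0:2] = b"\x00\x00"
    for p in range(2, isqrt(limit) + 1):
        if flags[p]:
            for m in range(p * p, limit + 1, p):
                flags[m] = 0
    return [i for i in range(limit + 1) if flags[i]]

def sieve_odd_segment(lo, count, primes):
    """Bit-packed segment: bit i is 1 iff the odd number lo + 2*i is prime.

    Assumes lo is odd, count >= 1, and `primes` reaches sqrt(lo + 2*count).
    """
    bits = bytearray(b"\xff") * ((count + 7) // 8)
    hi = lo + 2 * count  # exclusive upper bound of the window
    for p in primes:
        if p == 2:
            continue  # even numbers are not stored at all
        if p * p >= hi:
            break
        start = max(p * p, -(-lo // p) * p)  # first multiple of p to strike
        if start % 2 == 0:
            start += p  # move to the next odd multiple
        for m in range(start, hi, 2 * p):
            i = (m - lo) >> 1
            bits[i >> 3] &= 0xFF ^ (1 << (i & 7))
    if lo == 1:
        bits[0] &= 0xFE  # 1 is not prime
    return bits

def bit_is_prime(bits, lo, n):
    """Look up odd n in a segment that starts at odd lo."""
    i = (n - lo) >> 1
    return bool((bits[i >> 3] >> (i & 7)) & 1)

def is_prime_trial(n):
    """Slow but simple primality test used only by the fallback."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    return all(n % d for d in range(3, isqrt(n) + 1, 2))

def verify_even_range(start, end, segment_evens=4096, fast_limit=200):
    """Check every even n in [start, end) with start >= 6, segment by segment.

    Returns (counterexamples, fallback_hits). Memory per segment depends only
    on segment_evens and fast_limit, not on end.
    """
    start += start & 1  # round up to an even number
    base = small_primes_up_to(max(isqrt(end) + 1, fast_limit))
    fast = [p for p in base if 2 < p <= fast_limit]
    p_max = fast[-1]
    counterexamples, fallback_hits = [], 0
    for a in range(start, end, 2 * segment_evens):
        evens = range(a, min(a + 2 * segment_evens, end), 2)
        lo = max(1, a - p_max)          # lowest odd q = n - p we may query
        top = evens[-1] - 3             # highest odd q we may query
        bits = sieve_odd_segment(lo, (top - lo) // 2 + 1, base)
        for n in evens:
            found = False
            for p in fast:              # fast path: small p, table lookup for q
                q = n - p
                if q < lo:
                    break
                if bit_is_prime(bits, lo, q):
                    found = True
                    break
            if not found:               # slow path: exhaustive search
                fallback_hits += 1
                if not any(is_prime_trial(p) and is_prime_trial(n - p)
                           for p in range(2, n // 2 + 1)):
                    counterexamples.append(n)
    return counterexamples, fallback_hits
```

A call such as `verify_even_range(6, 5000, segment_evens=256)` walks the range in windows of 256 even numbers and returns the list of even numbers with no prime partition together with the number of times the slow path ran. Because each window is sieved independently from the small base primes, the working set stays constant as `end` grows, which is the property the segmented design relies on; in a GPU setting the windows could also be handed to separate devices, though that part is not modelled here.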