Graphics Processing Units (GPUs) are the state-of-the-art architecture for essential tasks, ranging from rendering 2D/3D graphics to accelerating workloads in supercomputing centers and, of course, Artificial Intelligence (AI). As GPUs continue improving to satisfy ever-increasing performance demands, analyzing past and current progress becomes paramount for anticipating future constraints on scientific research. This is particularly compelling in the AI domain, where rapid technological advancement and fierce global competition have led the United States to recently implement export control regulations limiting international access to advanced AI chips. For this reason, this paper studies technical progress in NVIDIA datacenter GPUs released from the mid-2000s to the present. Specifically, we compile a comprehensive dataset of NVIDIA datacenter GPUs comprising several features, ranging from computational performance to release price. We then examine trends in the main GPU features and estimate growth rates for computational performance, memory bandwidth, and performance per dollar and per watt. Our main results identify doubling times of 1.44 and 1.69 years for FP16 and FP32 operations (without accounting for sparsity benefits), while FP64 doubling times range from 2.06 to 3.79 years. Off-chip memory size and bandwidth grew at slower rates than computational performance, doubling every 3.32 to 3.53 years. The release prices of datacenter GPUs have roughly doubled every 5.1 years, while their power consumption has approximately doubled every 16 years. Finally, we quantify the implications of current U.S. export control regulations by estimating the performance gaps that would result if enforcement were complete and successful. We find that recently proposed changes to the export controls would shrink the potential performance gap from 23.6x to 3.54x.
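The doubling times reported above follow from fitting an exponential growth model to each feature over time: fitting log2 of the metric against release year gives a slope in doublings per year, and the doubling time is its reciprocal. The sketch below illustrates that calculation; the data points are hypothetical placeholders, not the paper's dataset.

```python
import math

# Hypothetical (release_year, FP32 TFLOPS) points -- illustrative only,
# NOT taken from the paper's dataset.
points = [(2016, 10.6), (2018, 15.7), (2020, 19.5), (2022, 60.0)]

# Least-squares fit of log2(performance) vs. year:
# the slope is the number of doublings per year.
n = len(points)
xs = [year for year, _ in points]
ys = [math.log2(perf) for _, perf in points]
xbar = sum(xs) / n
ybar = sum(ys) / n
slope = (
    sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    / sum((x - xbar) ** 2 for x in xs)
)

# Doubling time in years is the inverse of the doublings-per-year rate.
doubling_time = 1.0 / slope
print(f"doubling time: {doubling_time:.2f} years")
```

Applied per metric (FP16/FP32/FP64 throughput, memory size and bandwidth, price, power), this kind of fit yields the per-feature doubling times quoted in the results.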