Classical Amdahl's Law assumes a fixed decomposition into serial and parallel work and homogeneous replication; historically, it has bounded the parallel speedup that is attainable. Modern systems instead combine specialized accelerators with programmable compute, tensor datapaths, and evolving pipelines, while empirical scaling laws shift which stages absorb marginal compute. The central tension is therefore not the serial-versus-parallel split alone, but how to allocate resources across heterogeneous hardware, given differences in efficiency and workload structures that determine how effectively additional compute is converted into value. We reformulate Amdahl's Law for modern heterogeneous systems with scalable workloads. The analysis yields a finite collapse threshold: beyond a critical scalable fraction, specialization becomes suboptimal for any efficiency advantage of specialized hardware over programmable compute, and the optimal investment in specialized hardware falls to zero, a phase transition rather than an asymptotic tail. We use this framework to interpret the increasing programmability of GPUs and to explain why domain-specific AI accelerators have not displaced them.