Classical Amdahl's Law assumes a fixed decomposition of work into serial and parallel parts and homogeneous replication; historically, it has bounded the attainable parallel speedup. Modern systems instead combine specialized accelerators with programmable compute, tensor datapaths, and evolving pipelines, while empirical scaling laws shift which stages absorb marginal compute. The central tension is therefore not the serial-versus-parallel split alone, but how to allocate resources across heterogeneous hardware, given differences in efficiency and the workload structures that determine how effectively additional compute is converted into value. We reformulate Amdahl's Law for modern heterogeneous systems with scalable workloads. The analysis yields a finite collapse threshold: beyond a critical scalable fraction, specialization becomes suboptimal whatever the efficiency advantage of specialized hardware over programmable compute, and the optimal specialized investment falls to zero. This is a phase transition rather than an asymptotic tail. We use this framework to interpret the increasing programmability of GPUs and to explain why domain-specific AI accelerators have not displaced GPUs.
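For reference, the classical bound mentioned at the start can be written as follows. This is the standard textbook form only, not the reformulation proposed here; the symbols $p$ (parallelizable fraction of the work) and $N$ (number of identical processors) are notation assumed for this illustration.

```latex
% Classical Amdahl's Law (standard form; p and N are assumed notation)
S(N) = \frac{1}{(1 - p) + \dfrac{p}{N}},
\qquad
\lim_{N \to \infty} S(N) = \frac{1}{1 - p}.
```

For example, with $p = 0.95$ the speedup can never exceed $1/(1 - 0.95) = 20$, however many processors are added.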