Unlike the von Neumann architecture, which separates computation from memory, the brain tightly integrates them, an organization that large language models increasingly resemble. The crucial difference lies in the ratio of energy spent on computation versus data access: in the brain, most energy fuels computation, while in von Neumann architectures, data movement dominates. To capture this imbalance, we introduce the \emph{operation-operand disjunction constant} $G_d$, a dimensionless measure of the energy required for data transport relative to computation. As part of this framework, we propose the metaphor of \emph{data gravity}: just as mass exerts gravitational pull, large and frequently accessed data sets attract computation. We develop expressions for optimal computation placement and show that bringing computation closer to the data can reduce energy consumption by a factor of $G_d^{(\beta - 1)/2}$, where $\beta \in (1, 3)$ captures the empirically observed distance-dependent energy scaling. We demonstrate that these findings are consistent with measurements across processors from 45\,nm to 7\,nm, as well as with results from processing-in-memory (PIM) architectures. High $G_d$ values are therefore limiting: as $G_d$ increases, the energy cost of data movement slows the scaling of large language models and pushes modern computing toward a plateau. Unless computation is realigned with data gravity, the growth of AI may be capped not by algorithms but by physics.
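To make the headline expression concrete, the following minimal sketch (not part of the paper; the function name and the sample values of $G_d$ and $\beta$ are illustrative assumptions) evaluates the energy-reduction factor $G_d^{(\beta - 1)/2}$ for hypothetical inputs:

```python
def energy_reduction_factor(G_d: float, beta: float) -> float:
    """Factor by which moving computation toward the data reduces energy,
    per the abstract's expression G_d^((beta - 1) / 2).

    Assumes G_d > 1 (dimensionless) and beta in the open interval (1, 3),
    the empirically motivated range for distance-dependent energy scaling.
    """
    if not 1.0 < beta < 3.0:
        raise ValueError("beta must lie in the open interval (1, 3)")
    return G_d ** ((beta - 1.0) / 2.0)


# Hypothetical example: G_d = 1000 with quadratic distance scaling (beta = 2)
# gives a reduction of sqrt(1000), roughly 31.6x; a milder beta = 1.5 gives
# 1000^(1/4), roughly 5.6x.
print(f"{energy_reduction_factor(1000.0, 2.0):.1f}")
print(f"{energy_reduction_factor(1000.0, 1.5):.1f}")
```

Note that the factor is a sensitivity result: for fixed $\beta$, the savings from co-locating computation with data grow polynomially in $G_d$, which is why rising $G_d$ both threatens conventional scaling and rewards PIM-style placement.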