Constructing models from data contributes significantly to the energetic cost of computation. Understanding how foundational thermodynamic bounds apply to modeling algorithms will therefore become increasingly important. Here, we study the thermodynamic costs of a fundamental modeling algorithm: simple linear regression. Following Landauer, we approximate the thermodynamic lower bound on irreversibly performing both exact linear regression and linear regression via stochastic gradient descent, as implemented on floating-point numbers. From this, we derive energy-cost-aware scaling laws for the optimal dataset size for training a linear regression model, given a demand for inference that depends on the generalization error. Additionally, we discuss a method for lower-bounding the entropy production due to the mismatch cost for algorithms with continuous input variables.
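For orientation, the Landauer bound referred to above states that erasing one bit at temperature T dissipates at least k_B T ln 2 of heat. The following is a minimal sketch of that arithmetic, not the derivation used in this work. The function names, the 300 K default, and the crude bit count (every input bit not retained in the two output parameters is treated as erased, with inputs assumed maximally uncertain) are illustrative assumptions.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def landauer_bound_joules(bits_erased: float, temperature_kelvin: float = 300.0) -> float:
    """Minimum heat (J) dissipated when erasing `bits_erased` bits at temperature T."""
    return bits_erased * K_B * temperature_kelvin * math.log(2)


def naive_erased_bits(n_samples: int, bits_per_float: int = 64) -> int:
    """Hypothetical accounting: n (x, y) pairs in, (slope, intercept) out.

    Assumes every input bit not kept in the output is logically erased.
    This is an illustrative count only, not the paper's estimate.
    """
    input_bits = n_samples * 2 * bits_per_float   # x and y per sample
    output_bits = 2 * bits_per_float              # slope and intercept
    return input_bits - output_bits


if __name__ == "__main__":
    bits = naive_erased_bits(1000)
    print(f"bits erased (naive count): {bits}")
    print(f"Landauer bound at 300 K: {landauer_bound_joules(bits):.3e} J")
```

Under these assumptions the bound grows linearly with the dataset size, which is the kind of dependence an energy-cost-aware scaling law has to trade off against the gain in generalization error.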