With the rapid growth of Artificial Intelligence (AI) applications supported by deep learning (DL), the energy efficiency of these applications has an increasingly large impact on sustainability. We introduce Smaragdine, a new energy accounting system for tensor-based DL programs implemented with TensorFlow. At the heart of Smaragdine is a novel white-box methodology of energy accounting: Smaragdine is aware of the internal structure of the DL program, an approach we call tensor-aware energy accounting. With Smaragdine, the energy consumption of a DL program can be broken down into units aligned with the program's logical hierarchical decomposition. We apply Smaragdine to understand the energy behavior of BERT, one of the most widely used language models. Layer by layer and tensor by tensor, Smaragdine can identify the components of BERT with the highest energy and power consumption. Furthermore, we conduct two case studies on how Smaragdine supports downstream toolchain building: one on the comparative energy impact of hyperparameter tuning of BERT, the other on how energy behavior changes as BERT evolves into its next generation, ALBERT.