Narrow bit-width data formats are key to reducing the computational and storage costs of modern deep learning applications. This paper evaluates Microscaling (MX) data formats that combine a per-block scaling factor with narrow floating-point and integer types for individual elements. MX formats balance the competing needs of hardware efficiency, model accuracy, and user friction. Empirical results on over two dozen benchmarks demonstrate practicality of MX data formats as a drop-in replacement for baseline FP32 for AI inference and training with low user friction. We also show the first instance of training generative language models at sub-8-bit weights, activations, and gradients with minimal accuracy loss and no modifications to the training recipe.
翻译:窄位宽数据格式是降低现代深度学习应用计算与存储成本的关键。本文评估了微观缩放(MX)数据格式,该格式将逐块缩放因子与窄位浮点及整数类型相结合,用于单独元素。MX格式在硬件效率、模型精度与用户适配复杂度之间取得平衡。基于超过二十个基准测试的经验结果表明,MX数据格式可作为基线FP32的低摩擦即插即用替代方案,用于人工智能推理与训练。我们还首次展示了在权重、激活值和梯度均低于8位的情况下,以最小精度损失且无需修改训练方案来训练生成式语言模型的实例。