This paper proposes hardware converters for the microscaling format (MX-format), a reduced-precision representation of floating-point numbers. We present an algorithm and a memory-free hardware model for converting 32 single-precision floating-point numbers to MX-format. The proposed model supports six MX-format types: E5M2, E4M3, E3M2, E2M3, E2M1, and INT8. The conversion process consists of three steps: calculating the maximum absolute value among the 32 inputs, generating a shared scale, and producing 32 outputs in the selected MX-format type. The hardware converters were implemented on an FPGA, and experimental results demonstrate their effectiveness.
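The three conversion steps can be illustrated with a short software sketch. This is not the hardware model proposed here; it is a minimal Python reference under stated assumptions. It assumes the shared scale is a power of two chosen as floor(log2(max |v|)) minus the largest element exponent (the rule we understand the OCP MX specification to use), that the scale exponent is clamped to [-127, 127], that elements are rounded to nearest with ties to even and saturated, and that an all-zero block takes the smallest scale. The names `to_mx`, `quantize_fp`, `quantize_int8`, and `FP_FORMATS` are hypothetical, and the per-format constants are our reading of the element definitions. Elements are returned as decoded values rather than bit patterns.

```python
import math

BLOCK_SIZE = 32

# Assumed per-format parameters:
# (mantissa bits, minimum normal exponent, maximum exponent, largest magnitude)
FP_FORMATS = {
    "E5M2": (2, -14, 15, 57344.0),
    "E4M3": (3, -6, 8, 448.0),
    "E3M2": (2, -2, 4, 28.0),
    "E2M3": (3, 0, 2, 7.5),
    "E2M1": (1, 0, 2, 6.0),
}
INT8_EMAX = 0  # assumed: largest INT8 element is 127/64, just below 2**1


def floor_log2(x):
    """Exact floor(log2(x)) for a positive finite float."""
    return math.frexp(x)[1] - 1


def quantize_fp(x, mbits, emin, max_mag):
    """Round x to the nearest value on the element grid, then saturate."""
    if x == 0.0:
        return 0.0
    # Values below the smallest normal use the subnormal spacing of emin.
    exp = max(floor_log2(abs(x)), emin)
    step = 2.0 ** (exp - mbits)
    q = round(x / step) * step  # Python's round() resolves ties to even
    return max(-max_mag, min(max_mag, q))


def quantize_int8(x):
    """Round x to a multiple of 1/64, saturating at +/-127/64."""
    k = max(-127, min(127, round(x * 64)))
    return k / 64


def to_mx(values, fmt):
    """Convert 32 floats to (shared scale exponent, 32 decoded elements)."""
    assert len(values) == BLOCK_SIZE
    # Step 1: maximum absolute value among the 32 inputs.
    max_abs = max(abs(v) for v in values)
    # Step 2: shared power-of-two scale (assumed rule, see lead-in).
    emax = INT8_EMAX if fmt == "INT8" else FP_FORMATS[fmt][2]
    if max_abs == 0.0:
        shared_exp = -127  # arbitrary choice for an all-zero block
    else:
        shared_exp = max(-127, min(127, floor_log2(max_abs) - emax))
    scale = 2.0 ** shared_exp
    # Step 3: scale each input and quantize it to the selected element type.
    if fmt == "INT8":
        elems = [quantize_int8(v / scale) for v in values]
    else:
        mbits, emin, _, max_mag = FP_FORMATS[fmt]
        elems = [quantize_fp(v / scale, mbits, emin, max_mag) for v in values]
    return shared_exp, elems
```

Each input is approximated by `elem * 2.0 ** shared_exp`. Under the assumed scale rule, the largest input can still exceed the element range after scaling, which is why the sketch saturates rather than raising an error.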