Numerical features of the matrix multiplier hardware units in NVIDIA and AMD data centre GPUs have recently been studied. The features of interest include rounding, normalisation, and the internal precision of the accumulators. In this paper, we extend the methodology for analysing those features to consumer-grade NVIDIA GPUs by implementing an architecture-independent test scheme for various input and output precision formats. Unlike current approaches, the proposed test vector generation method neither performs an exhaustive search nor relies on hard-coded, device-specific constants, yet it remains applicable to a wide range of mixed-precision formats. We applied the scheme to the RTX-3060 (Ampere architecture) and Ada RTX-1000 (Ada Lovelace architecture) graphics cards and determined the numerical features of their matrix multipliers for binary16, TensorFloat32, and bfloat16 input floating-point formats and for binary16 and binary32 IEEE 754 output formats. Our methodology allowed us to determine that the numerical features of the RTX-3060, a consumer-grade GPU, are identical to those of the A100, a data centre GPU. We do not expect our code to require any changes to analyse matrix multipliers on newer NVIDIA GPUs, such as Hopper or Blackwell and their future successors, or for any input/output format combination, including the latest 8-bit floating-point formats.
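As a rough illustration of what a test vector that avoids device-specific constants might look like, the Python sketch below probes a software stand-in for a matrix multiplier. It is not the scheme implemented in this paper. The function `model_block_dot`, its parameter `kept_frac_bits`, and the probe `probe_kept_bits` are hypothetical names introduced only for this example, and the model (align all terms to the largest exponent, truncate, then add) is an assumed behaviour rather than a measured one. The probe also ignores the exponent range of the input format, which a test on real hardware would have to respect.

```python
from fractions import Fraction


def floor_log2(t):
    """Exact floor(log2(t)) for a positive Fraction."""
    e = t.numerator.bit_length() - t.denominator.bit_length()
    if Fraction(2) ** e > t:
        e -= 1
    return e


def model_block_dot(a, b, c, kept_frac_bits):
    """Hypothetical model of c + sum(a[i] * b[i]).

    Assumption: products are exact, every term is aligned to the largest
    exponent among the terms, and bits below 2**(max_exp - kept_frac_bits)
    are truncated before the terms are added.
    """
    terms = [Fraction(x) * Fraction(y) for x, y in zip(a, b)] + [Fraction(c)]
    nonzero = [abs(t) for t in terms if t != 0]
    if not nonzero:
        return 0.0
    max_exp = max(floor_log2(t) for t in nonzero)
    quantum = Fraction(2) ** (max_exp - kept_frac_bits)
    # int() on a Fraction truncates toward zero.
    total = sum(int(t / quantum) for t in terms) * quantum
    return float(total)


def probe_kept_bits(dot, max_k=64):
    """Count fractional bits that survive alignment, without device constants.

    Computes 1 + 2**-(k+1) - 1 for growing k; the result is exactly
    2**-(k+1) unless the small term was dropped during alignment.
    """
    k = 0
    while k < max_k and dot([2.0 ** -(k + 1), -1.0], [1.0, 1.0], 1.0) != 0.0:
        k += 1
    return k


if __name__ == "__main__":
    for bits in (23, 24, 26):  # illustrative values, not measurements
        found = probe_kept_bits(lambda a, b, c: model_block_dot(a, b, c, bits))
        print(bits, found)
```

To apply the same idea to a GPU, `dot` would be replaced by a wrapper that issues the corresponding matrix multiply-accumulate in the input and output formats under test; the probe itself would stay unchanged, because it discovers the bit count by iteration rather than by assuming it.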