Kernel Contracts: A Specification Language for ML Kernel Correctness Across Heterogeneous Silicon

Every ML kernel ships with an implicit contract about what it computes. People rarely write the contract down. When two kernels disagree -- when a matmul on AMD produces a different gradient than the same matmul on NVIDIA, when a fused attention kernel silently downcasts an accumulator, when an out-of-bounds access returns zero on one stack and garbage on another -- there is no formal artifact to arbitrate the dispute. Recent empirical work has measured the gap across silicon platforms, but none of it specifies the contract being violated. We present a specification language for kernel contracts. A contract has eight parts: identifier, scope, precondition, postcondition, tolerance, reference oracle, measurement protocol, and violation signature. We use it to state twelve contract classes covering precision, ordering, compiler-induced, and exceptional-value failure modes, each grounded in published empirical evidence. We require a three-state calibration: every contract must admit at least one reference-conforming implementation and at least one contract-violating implementation that passes basic functional tests. We apply the framework to three documented incidents -- Huawei Ascend silent precision coercion, Sakana AI CUDA Engineer reward hacking, AMD out-of-bounds silent acceptance -- and show that each informal diagnosis maps to a specific contract violation with a measurable signature. A kernel contract suite is a normative reference against which conformance can be graded, in the way that ISASecure grades industrial control systems against IEC 62443.
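
As a rough illustration of the eight-part structure, the following Python sketch encodes one contract as a record and checks the two calibration conditions stated above: a reference-conforming implementation exists, and a contract-violating implementation exists that still passes a basic functional test. Everything here is an assumption made for illustration and not the paper's notation: the names KernelContract, conforms, calibrated, and dot_with_accumulator, the contract identifier, the 1e-6 tolerance, the 4096-element input, and the choice of a dot product with an emulated fp16 accumulator as the violating kernel.

```python
import math
import struct
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
DotFn = Callable[[Vector, Vector], float]


def _round(fmt: str, x: float) -> float:
    """Round x to the format named by a struct code ('e' = fp16, 'f' = fp32)."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def dot_with_accumulator(fmt: str) -> DotFn:
    """Build a dot product whose running sum is rounded to `fmt` after every add."""
    def dot(a: Vector, b: Vector) -> float:
        acc = 0.0
        for x, y in zip(a, b):
            acc = _round(fmt, acc + x * y)
        return acc
    return dot


@dataclass(frozen=True)
class KernelContract:
    """Hypothetical record with the eight parts named in the abstract."""
    identifier: str
    scope: str
    precondition: Callable[[Vector, Vector], bool]
    postcondition: Callable[[float, float, float], bool]  # (got, expected, tolerance)
    tolerance: float
    reference_oracle: DotFn
    measurement_protocol: Callable[[], List[Tuple[Vector, Vector]]]
    violation_signature: str


def conforms(contract: KernelContract, impl: DotFn) -> bool:
    """Run the measurement protocol and check the postcondition against the oracle."""
    for a, b in contract.measurement_protocol():
        if not contract.precondition(a, b):
            continue  # the contract is silent on inputs outside its precondition
        expected = contract.reference_oracle(a, b)
        if not contract.postcondition(impl(a, b), expected, contract.tolerance):
            return False
    return True


def passes_basic_functional_test(impl: DotFn) -> bool:
    """A deliberately shallow test: one short input, loose tolerance."""
    return math.isclose(impl([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]), 32.0, rel_tol=1e-3)


def calibrated(contract: KernelContract, conforming: DotFn, violating: DotFn) -> bool:
    """Check the two calibration conditions stated in the abstract."""
    return (conforms(contract, conforming)
            and passes_basic_functional_test(violating)
            and not conforms(contract, violating))


# Illustrative contract; every value below is an assumption, not from the paper.
ACCUMULATOR_CONTRACT = KernelContract(
    identifier="hypothetical.dot.accumulator-precision",
    scope="dot product over fp16-representable inputs",
    precondition=lambda a, b: len(a) == len(b) > 0
    and all(map(math.isfinite, [*a, *b])),
    postcondition=lambda got, want, tol: abs(got - want) <= tol * abs(want),
    tolerance=1e-6,
    reference_oracle=lambda a, b: math.fsum(x * y for x, y in zip(a, b)),
    measurement_protocol=lambda: [([1.0] * 4096, [1.0] * 4096)],
    violation_signature="relative error grows with reduction length",
)

fp32_dot = dot_with_accumulator("f")  # stands in for a conforming kernel
fp16_dot = dot_with_accumulator("e")  # stands in for a silently downcast accumulator

print(calibrated(ACCUMULATOR_CONTRACT, fp32_dot, fp16_dot))
```

In this sketch the fp16 accumulator agrees with the reference on the three-element functional test but stops growing once the running sum reaches 2048, where adding 1.0 no longer changes an fp16 value. The long-reduction input in the measurement protocol is therefore what exposes the violation, which is the kind of gap between functional testing and contract conformance that the calibration requirement is meant to guarantee.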
