Gerrymandering the Warp: Non-Control-Data Attacks on CUDA Collective Decision

CUDA collective operations often sit on security decision paths: votes accept batches, reductions aggregate evidence, shuffles select representatives, and barriers order checked state before use. Such decisions depend not only on computed values, but also on which lanes are represented, what evidence they contribute, which lane speaks for the group, and which checked state reaches commit. We identify this participation metadata as decision-making non-control data. We define Collective Semantic Corruption (CSC), a non-control-data attack family in which range-valid masks, predicates, source lanes, descriptors, group labels, or epochs cause a CUDA-conforming collective to authorize a decision over the wrong membership, contribution, role, or validation-to-use state. The kernel reaches the intended collective site and executes the expected primitive; the primitive represents the wrong authority set. We model CSC with a site-local participation-authority contract. A protected collective derives, recomputes, checks, or freezes membership, contribution, role, and temporal state before authorization. We evaluate CSC across NVIDIA CUDA collective primitives, trigger channels, compact workload-style kernels, reduced idiom bridges, and admission-guard harnesses. In a CUDA-defined contract-conformance suite spanning the four authority dimensions, corrupted participation metadata causes a trusted-reference mismatch in 102/102 instances, while hardened variants preserve that reference in 102/102. We report 13 synchronization-sensitive instances separately. We then introduce Collective Integrity Contracts (CIC), a wrapper discipline that binds participation metadata before collective use. For CUDA collective decisions, security depends on both the values computed and the participants represented.
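
To make the mask case concrete, the following is a minimal host-side C++ sketch of an "all lanes agree" vote in which a range-valid membership mask silently drops a dissenting lane. It is an illustration under stated assumptions, not the paper's CIC implementation and not CUDA device code: `model_vote_all` is a hypothetical stand-in that only models the documented behavior of a warp-wide all-vote (true when every lane named in the mask has a true predicate), and `derive_membership` and `guarded_vote_all` are hypothetical names for a wrapper that derives the expected membership from a trusted lane count and fails closed when the supplied mask differs.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kWarpSize = 32;
using LanePredicates = std::array<bool, kWarpSize>;

// Host-side model (assumption): an "all" vote is true iff every lane
// named in `mask` reports a true predicate. Lanes outside the mask
// contribute nothing to the decision.
bool model_vote_all(std::uint32_t mask, const LanePredicates& pred) {
    for (int lane = 0; lane < kWarpSize; ++lane) {
        const bool is_member = ((mask >> lane) & 1u) != 0u;
        if (is_member && !pred[lane]) {
            return false;
        }
    }
    return true;
}

// Hypothetical helper: derive the membership mask from a trusted lane
// count instead of reading it from mutable, attacker-reachable state.
std::uint32_t derive_membership(int active_lanes) {
    assert(active_lanes >= 0 && active_lanes <= kWarpSize);
    if (active_lanes == 0) {
        return 0u;
    }
    if (active_lanes == kWarpSize) {
        return 0xFFFFFFFFu;  // avoid an undefined 32-bit shift by 32
    }
    return (1u << active_lanes) - 1u;
}

// Hypothetical guard in the spirit of a participation-authority check:
// the supplied mask must equal the derived membership before the vote
// is allowed to authorize anything. A mismatch fails closed.
bool guarded_vote_all(std::uint32_t supplied_mask,
                      int active_lanes,
                      const LanePredicates& pred) {
    const std::uint32_t expected = derive_membership(active_lanes);
    if (supplied_mask != expected) {
        return false;
    }
    return model_vote_all(supplied_mask, pred);
}
```

In this model, if lane 7 rejects its item, a vote over the full mask rejects the batch, while a vote over the same mask with bit 7 cleared accepts it even though every bit still names a valid lane. The guarded variant rejects the cleared mask because it no longer matches the membership derived from the lane count.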
