Recent CUDA exploitation work shows that GPU memory bugs can escalate into device-side control-flow corruption when kernels later consume corrupted return continuations, function pointers, dispatch-table entries, or branch targets. For deployed CUDA binaries, the relevant security boundary is the executed NVIDIA SASS, as shaped by PTX lowering, inlining, ABI decisions, register allocation, spills, predication, and SIMT execution; source- or PTX-level policies do not capture this boundary. We present WarpGuard, to our knowledge the first protected-site CFI system for CUDA device binaries that operates on executed SASS. WarpGuard enforces policy at protected sites: recovered SASS instructions or sequences that consume control-flow state, provide sufficient binary evidence to derive a policy, are checked before the transfer is released, and fail closed on violation. It authenticates backward-edge continuation state for instrumented returns and validates recoverable forward targets per site. Fixed-edge, unsupported, profile-excluded, fallback, and no-surface outcomes are reported separately, outside the protected denominator. On 77 CUDA artifacts, WarpGuard classifies 51,621 SASS control-flow sites, including 1,343 returns and 154 supported forward target-set entries, and records 52.2 million dynamic checks. In representative backward- and forward-edge corruption attacks, native execution reaches attacker-selected behavior, detect-only mode records the expected violation, and enforcement fails closed before releasing the invalid protected transfer. Public-code evidence shows that the same SASS consumption patterns occur in real CUDA systems, including runtime dispatch tables, cuFFT callbacks, generated callable tables, and uploaded device-function pointers. WarpGuard thus delivers auditable protected-site CFI for CUDA SASS, and it separates dynamic-instrumentation enforcement from callback-free SASS timing and patch-cache feasibility.
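The three execution modes and the fail-closed rule can be illustrated with a small host-side model. This is a minimal sketch under stated assumptions, not WarpGuard's implementation: the names `SiteMonitor`, `Mode`, `ControlFlowViolation`, and `release` are hypothetical, the policy is modeled as a plain mapping from a protected site to its allowed target set, and the addresses are made up. The real system operates on SASS inside the device binary rather than in Python.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    NATIVE = "native"    # no checking at all
    DETECT = "detect"    # record violations, still release the transfer
    ENFORCE = "enforce"  # fail closed before releasing the transfer


class ControlFlowViolation(Exception):
    """Raised in enforce mode when a protected transfer is rejected."""


@dataclass
class SiteMonitor:
    # Hypothetical policy: protected site id -> allowed target addresses.
    policy: dict
    mode: Mode = Mode.ENFORCE
    checks: int = 0
    violations: list = field(default_factory=list)

    def release(self, site: int, target: int) -> int:
        """Return the target if the transfer at this site may proceed."""
        allowed = self.policy.get(site)
        if self.mode is Mode.NATIVE or allowed is None:
            # Sites with no derived policy are outside the protected set
            # and are passed through unchecked.
            return target
        self.checks += 1
        if target in allowed:
            return target
        self.violations.append((site, target))
        if self.mode is Mode.ENFORCE:
            # Fail closed: the invalid transfer is never released.
            raise ControlFlowViolation(
                f"site {site:#x}: target {target:#x} not in allowed set"
            )
        return target  # detect-only: log the violation and continue
```

In this model, a corrupted target passes silently in `NATIVE` mode, is logged but still released in `DETECT` mode, and raises before release in `ENFORCE` mode. Transfers at sites without a policy are never counted in `checks`, which mirrors the idea of keeping unprotected outcomes outside the protected denominator.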