To improve efficiency, nearly all parallel processing units (CPUs and GPUs) implement relaxed memory models in which memory operations may be re-ordered, i.e., executed out of order. Prior testing work in this area found that memory re-orderings are observed more frequently when other cores are active (e.g., stressing the memory system), likely because the added load triggers aggressive hardware optimizations. In this work, we present Memory DisOrder: a timerless side-channel that uses memory re-orderings to infer activity in other processes. We first perform a fuzzing campaign and show that many mainstream processors (X86/Arm/Apple CPUs, NVIDIA/AMD/Apple GPUs) are susceptible to cross-process signals. We then show how the vulnerability can be used to implement classic attacks: a covert channel, achieving up to 16 bits/second with 95% accuracy on an Apple M3 GPU, and application fingerprinting, achieving reliable closed-world DNN architecture fingerprinting on several CPUs and an Apple M3 GPU. Finally, we explore how low-level system details can be exploited to increase re-orderings, showing the potential for a covert channel to achieve nearly 30K bits/second on X86 CPUs. More precise attacks can likely be developed as the vulnerability becomes better understood.
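To make the covert-channel idea concrete, the following is a minimal sketch of how a receiver might turn observed re-ordering counts into bits. It assumes a simple on/off encoding that the abstract does not specify: the sender stresses the memory system to signal a 1 and idles to signal a 0, while the receiver counts re-orderings in fixed windows and thresholds them. The function names `decode_bits` and `accuracy`, the midpoint threshold, and the sample counts are all hypothetical and synthetic, not the authors' protocol or measurements.

```python
def decode_bits(reorder_counts, threshold=None):
    """Map per-window re-ordering counts to bits.

    Assumption (not from the paper): a window with many re-orderings
    means the sender was stressing memory (bit 1); few means idle (bit 0).
    """
    if not reorder_counts:
        return []
    if threshold is None:
        # Hypothetical calibration: midpoint between the lowest and
        # highest counts seen in this transmission.
        threshold = (min(reorder_counts) + max(reorder_counts)) / 2
    return [1 if count > threshold else 0 for count in reorder_counts]


def accuracy(sent, received):
    """Fraction of bit positions where the received bit matches the sent bit."""
    if len(sent) != len(received) or not sent:
        raise ValueError("sent and received must be non-empty and equal length")
    matches = sum(1 for s, r in zip(sent, received) if s == r)
    return matches / len(sent)


if __name__ == "__main__":
    # Synthetic counts for eight windows, for illustration only.
    counts = [3, 41, 38, 5, 2, 44, 4, 39]
    sent = [0, 1, 1, 0, 0, 1, 0, 1]
    received = decode_bits(counts)
    print(received, accuracy(sent, received))
```

Note that the receiver in this sketch needs only a counter, not a clock, which is the sense in which such a channel can be timerless. A real receiver would also need window synchronization and a more robust threshold, both of which are omitted here.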