Hardware peripherals such as GPUs and FPGAs are commonly available in server-grade computing to accelerate specific compute tasks, from database queries to machine learning. Cloud service providers (CSPs) have integrated these accelerators into their infrastructure and let tenants combine and configure them flexibly, based on their needs. Securing I/O interfaces is critical to ensuring proper isolation between tenants in these highly complex, heterogeneous, yet shared server systems, especially in the cloud, where some peripherals may be under the control of a malicious tenant. In this work, we investigate the interfaces that connect peripheral hardware components to each other and to the rest of the system. We show that the I/O memory management units (IOMMUs), which are intended to ensure proper isolation of peripherals, are the source of a new attack surface: the I/O translation look-aside buffer (IOTLB). We show that an FPGA accelerator card can be used to gain precise information about IOTLB activity. That information can be used for covert communication between peripherals without involving the CPU, or to directly extract leakage from neighboring accelerated compute jobs such as GPU-accelerated databases. We present the first qualitative and quantitative analysis of this newly uncovered attack surface before fine-grained channels become widely viable with the introduction of CXL and PCIe 5.0. In addition, we propose possible countermeasures that software developers, hardware designers, and system administrators can use to suppress the observed side-channel leakage, and we analyze their implicit costs.