Conduit: Programmer-Transparent Near-Data Processing Using Multiple Compute-Capable Resources in Solid State Drives

Rakesh Nadig, Vamanan Arulchelvan, Mayank Kabra, Harshita Gupta, Rahul Bera, Nika Mansouri Ghiasi, Nanditha Rao, Qingcai Jiang, Andreas Kosmas Kakolyris, Yu Liang, Mohammad Sadrosadati, Onur Mutlu

arXiv preprint. To appear in the IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2026.

Solid-state drives (SSDs) are well suited for near-data processing (NDP) because they (1) store large application datasets, and (2) support three NDP paradigms: in-storage processing (ISP), processing using DRAM in the SSD (PuD-SSD), and in-flash processing (IFP). Many prior SSD-based NDP techniques operate in isolation, mapping computations to only one or two of these paradigms within the SSD. These techniques (1) are tailored to specific workloads or kernels, (2) do not exploit the full computational potential of an SSD, and (3) lack programmer transparency. While several prior works propose techniques to partition computation between the host and near-memory accelerators, adapting these techniques to SSDs has limited benefits because they (1) ignore the heterogeneity of the SSD resources, and (2) make offloading decisions based on limited factors such as bandwidth utilization or data movement cost. We propose Conduit, a general-purpose, programmer-transparent NDP framework for SSDs that leverages multiple SSD computation resources. At compile time, Conduit runs a custom compiler pass (e.g., in LLVM) that (i) vectorizes suitable application code segments into SIMD operations that align with the SSD's page layout, and (ii) embeds metadata (e.g., operation type, operand sizes) into the vectorized instructions to guide runtime offloading decisions. At runtime, within the SSD, Conduit performs instruction-granularity offloading by evaluating six key features and using a cost function to select the most suitable SSD resource. We evaluate Conduit and two prior NDP offloading techniques using an in-house event-driven SSD simulator on six data-intensive workloads. Conduit outperforms the best-performing prior offloading policy by 1.8x and reduces energy consumption by 46%.
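
The runtime step described in the abstract (per-instruction offloading driven by a cost function) can be illustrated with a minimal Python sketch. This is not Conduit's actual cost model: the abstract does not list the six features, so the three terms used here (compute time, data movement, queueing delay), the supported-operation sets, and every number and name (`VectorOp`, `Resource`, `estimated_cost`, `select_resource`) are hypothetical assumptions chosen only to show the shape of the decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VectorOp:
    """One vectorized instruction plus compiler-embedded metadata."""
    op_type: str        # operation type, e.g. "and", "add", "mul"
    operand_bytes: int  # total operand size in bytes


@dataclass(frozen=True)
class Resource:
    """A compute-capable SSD resource. All parameters are made-up values."""
    name: str
    supported_ops: frozenset
    bytes_per_us: float          # hypothetical compute throughput
    move_cost_us_per_kib: float  # hypothetical cost of moving operands
    queue_delay_us: float        # hypothetical current queueing delay


def estimated_cost(op: VectorOp, res: Resource) -> float:
    """Hypothetical cost in microseconds; infinite if the op is unsupported."""
    if op.op_type not in res.supported_ops:
        return float("inf")
    compute = op.operand_bytes / res.bytes_per_us
    movement = (op.operand_bytes / 1024) * res.move_cost_us_per_kib
    return compute + movement + res.queue_delay_us


def select_resource(op: VectorOp, resources: list) -> Resource:
    """Pick the resource with the lowest estimated cost for this instruction."""
    best = min(resources, key=lambda r: estimated_cost(op, r))
    if estimated_cost(op, best) == float("inf"):
        raise ValueError(f"no resource supports {op.op_type!r}")
    return best


# Illustrative resource table (assumed values, not measurements).
RESOURCES = [
    Resource("IFP", frozenset({"and", "or", "xor"}), 64.0, 0.0, 5.0),
    Resource("PuD-SSD", frozenset({"and", "or", "xor", "add"}), 32.0, 0.5, 1.0),
    Resource("ISP", frozenset({"and", "or", "xor", "add", "mul"}), 8.0, 1.0, 0.5),
]

if __name__ == "__main__":
    for op in (VectorOp("and", 16384), VectorOp("and", 64),
               VectorOp("add", 16384), VectorOp("mul", 16384)):
        print(op, "->", select_resource(op, RESOURCES).name)
```

Under these assumed numbers, a large bitwise operation lands on IFP, while a small one goes to PuD-SSD because the fixed queueing term dominates. That is the kind of trade-off a multi-feature cost function can capture and a single-factor policy (e.g., bandwidth only) cannot.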
