Solid-state storage architectures (SSDs) based on NAND or emerging memory devices are fundamentally architected and optimized for both reliability and performance. Achieving both goals simultaneously requires co-designing memory components with firmware-architected Error Management (EM) algorithms for density- and performance-scaled memory technologies. We describe a Machine Learning (ML) for systems methodology and modeling approach for co-designing the EM subsystem together with the natural variance inherent in the scaled silicon process of the memory components underlying SSD technology. The modeling analyzes NAND memory components and EM algorithms interacting with a comprehensive suite of synthetic (stress-focused and JEDEC) and emulation (YCSB and similar) workloads across Flash Translation abstraction layers, leveraging a statistically interpretable and intuitively explainable ML algorithm. The generalizable co-design framework evaluates several thousand datacenter SSDs spanning multiple generations of memory and storage technology. Consequently, the modeling framework enables continuous, holistic, data-driven design toward generational architectural advancements. We additionally demonstrate that the framework enables Representation Learning of the EM-workload domain, enhancing the architectural design space across a broad spectrum of workloads.