AERO: Adaptive Erase Operation for Improving Lifetime and Performance of Modern NAND Flash-Based SSDs

From arXiv. Accepted for publication in the Proceedings of the 29th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2024.

This work investigates a new erase scheme in NAND flash memory to improve the lifetime and performance of modern solid-state drives (SSDs). In NAND flash memory, an erase operation applies a high voltage (e.g., > 20 V) to flash cells for a long time (e.g., > 3.5 ms), which degrades cell endurance and potentially delays user I/O requests. While a large body of prior work has proposed various techniques to mitigate the negative impact of erase operations, no work has yet investigated how erase latency should be set to fully exploit the potential of NAND flash memory; most existing techniques use a fixed latency for every erase operation, set to cover the worst-case operating conditions. To address this, we propose AERO (Adaptive ERase Operation), a new erase scheme that dynamically adjusts erase latency to be just long enough to reliably erase the target cells, depending on the cells' current erase characteristics. AERO accurately predicts this near-optimal erase latency based on the number of fail bits during an erase operation. To maximize its benefits, we further optimize AERO in two respects. First, at the beginning of an erase operation, AERO attempts to erase the cells for a short time (e.g., 1 ms), which enables AERO to always obtain the number of fail bits necessary to accurately predict the near-optimal erase latency. Second, AERO aggressively yet safely reduces erase latency by leveraging the large reliability margin present in modern SSDs. We demonstrate the feasibility and reliability of AERO using 160 real 3D NAND flash chips, showing that it enhances SSD lifetime over the conventional erase scheme by 43% without changes to existing NAND flash chips. Our system-level evaluation using eleven real-world workloads shows that an AERO-enabled SSD reduces read tail latency by 34% on average over a state-of-the-art technique.
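The control flow described in the abstract (a short initial erase pulse, a fail-bit count, then a predicted amount of additional erase time) can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the authors' implementation: the `SimulatedBlock` model, the linear relation between fail bits and remaining erase time, the 0.5 ms step size, and the `tolerated_fail_bits` parameter (standing in for the reliability margin) are all hypothetical. Only the 3.5 ms and 1 ms figures come from the abstract's examples.

```python
from dataclasses import dataclass

FIXED_ERASE_MS = 3.5       # worst-case fixed latency (example value from the abstract)
SHALLOW_ERASE_MS = 1.0     # short initial erase pulse (example value from the abstract)
STEP_MS = 0.5              # hypothetical granularity of additional erase time
FAIL_BITS_PER_STEP = 500   # hypothetical: fail bits cleared by one STEP_MS of erasing


@dataclass
class SimulatedBlock:
    """Toy block that is fully erased after `needed_ms` of cumulative erase time."""
    needed_ms: float
    applied_ms: float = 0.0

    def erase_pulse(self, duration_ms: float) -> None:
        self.applied_ms += duration_ms

    def fail_bits(self) -> int:
        # Assumed linear model: fail bits shrink as erase time accumulates.
        remaining_ms = max(0.0, self.needed_ms - self.applied_ms)
        return int(round(remaining_ms / STEP_MS * FAIL_BITS_PER_STEP))


def predict_remaining_ms(fail_bits: int, tolerated_fail_bits: int) -> float:
    """Map a fail-bit count to extra erase time, rounded up to a whole step."""
    excess = max(0, fail_bits - tolerated_fail_bits)
    steps = -(-excess // FAIL_BITS_PER_STEP)  # integer ceiling division
    return steps * STEP_MS


def adaptive_erase(block: SimulatedBlock, tolerated_fail_bits: int = 0) -> float:
    """Erase `block` adaptively and return the total erase time spent (ms)."""
    block.erase_pulse(SHALLOW_ERASE_MS)       # always start with the short pulse
    total_ms = SHALLOW_ERASE_MS
    while total_ms < FIXED_ERASE_MS:          # never exceed the worst-case latency
        extra_ms = predict_remaining_ms(block.fail_bits(), tolerated_fail_bits)
        if extra_ms == 0.0:                   # few enough fail bits: erase is done
            break
        extra_ms = min(extra_ms, FIXED_ERASE_MS - total_ms)
        block.erase_pulse(extra_ms)
        total_ms += extra_ms
    return total_ms


if __name__ == "__main__":
    for needed in (0.8, 2.2, 3.5):
        spent = adaptive_erase(SimulatedBlock(needed_ms=needed))
        print(f"needs {needed} ms -> erased in {spent} ms (fixed scheme: {FIXED_ERASE_MS} ms)")
```

In this toy model, a block that needs little erase time finishes after the initial pulse, while a block at the worst case still receives the full fixed latency. Raising `tolerated_fail_bits` shortens the erase further, which loosely mirrors the idea of spending part of the reliability margin on reduced erase latency.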
