DRAM is the primary technology used for main memory in modern systems. Unfortunately, as DRAM scales down to smaller technology nodes, it faces key challenges in both data integrity and latency, which strongly affect overall system reliability, security, and performance. To develop reliable, secure, and high-performance DRAM-based main memory for future systems, it is critical to rigorously characterize, analyze, and understand various aspects (e.g., reliability, retention, latency, RowHammer vulnerability) of existing DRAM chips and their architecture. The goal of this dissertation is to 1) develop techniques and infrastructures that enable such rigorous characterization, analysis, and understanding, and 2) enable new mechanisms that improve DRAM performance, reliability, and security based on the developed understanding. To this end, we 1) design, implement, and prototype a new practical and flexible FPGA-based DRAM characterization infrastructure (called SoftMC), 2) use this infrastructure to develop a new experimental methodology (called U-TRR) that uncovers the operation of existing proprietary in-DRAM RowHammer protection mechanisms, and craft new RowHammer access patterns that efficiently circumvent these mechanisms, 3) propose a new DRAM architecture, called Self-Managing DRAM, that enables autonomous and efficient in-DRAM maintenance operations, which provide not only better performance, efficiency, and reliability but also faster and easier adoption of changes to DRAM chips, and 4) propose a versatile DRAM substrate, called the Copy-Row (CROW) substrate, that enables new mechanisms for improving DRAM performance, energy consumption, and reliability.