This paper analyzes fault-tolerant memory systems, focusing on recovery techniques against transient errors that are modeled with Markov chains. The study applies memory scrubbing in conjunction with Single Error Correction and Double Error Detection (SEC-DED) codes and considers three models: 1) exponentially distributed scrubbing, in which a memory word is checked at exponentially distributed time intervals; 2) deterministic scrubbing, in which each word is checked periodically at a fixed interval; and 3) mixed scrubbing, which combines the probabilistic and deterministic approaches. Reliability and Mean Time to Failure (MTTF) are estimated for each model. The results indicate that mixed scrubbing achieves higher reliability and MTTF than the simpler scrubbing methods.
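To make the first two models concrete, the following is a minimal sketch of a single SEC-DED word, not the paper's own formulation. It assumes a three-state chain (no error, one correctable error, failed), independent bit upsets at a per-bit rate `lam`, and an erroneous bit that stays erroneous until the word is scrubbed. For exponentially distributed scrubbing at rate `mu`, solving the first-step equations of that chain gives an MTTF of (mu + (2n - 1) lam) / (n (n - 1) lam^2). For deterministic scrubbing with a fixed `period`, the word survives one period if at most one of its `n_bits` bits is hit. All function and parameter names here are hypothetical.

```python
import math


def mttf_exponential_scrubbing(n_bits, lam, mu):
    """MTTF of one SEC-DED word under exponentially distributed scrubbing.

    Assumed chain: 0 errors --(n*lam)--> 1 error --((n-1)*lam)--> failed,
    with scrubbing returning 1 error --(mu)--> 0 errors.
    """
    if n_bits < 2 or lam <= 0 or mu < 0:
        raise ValueError("need n_bits >= 2, lam > 0, mu >= 0")
    return (mu + (2 * n_bits - 1) * lam) / (n_bits * (n_bits - 1) * lam ** 2)


def survival_per_deterministic_scrub(n_bits, lam, period):
    """Probability that a word accumulates at most one bit error in one period."""
    q = math.exp(-lam * period)  # probability a given bit is not hit
    return q ** n_bits + n_bits * (1.0 - q) * q ** (n_bits - 1)


def reliability_deterministic(n_bits, lam, period, cycles):
    """Reliability after a whole number of scrub cycles (assumed independent)."""
    return survival_per_deterministic_scrub(n_bits, lam, period) ** cycles
```

For example, `mttf_exponential_scrubbing(39, 1e-6, 1.0)` models a 39-bit word (32 data bits plus 7 check bits, an assumed layout) scrubbed on average once per time unit. With `mu = 0` the expression reduces to the unscrubbed case, 1/(n lam) + 1/((n - 1) lam). The mixed model from the abstract is not sketched here, since the abstract does not specify how the two mechanisms are combined.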