[DATE] Twin ECC: A Data Duplication Based ECC for Strong DRAM Error Re…


Hyeong Kon Bae, Myung Jae Chung, Young-Ho Gong, and Sung Woo Chung, “Twin ECC: A Data Duplication Based ECC for Strong DRAM Error Resilience”, Design, Automation and Test in Europe Conference (DATE 2023), Antwerp, Belgium, April 2023.

Abstract

With the continuous scaling of process technology, DRAM reliability has become a critical challenge in modern memory systems. DRAM memory systems for servers currently employ ECC DIMMs with a single error correction and double error detection (SECDED) code. However, the SECDED code is no longer sufficient to ensure DRAM reliability, as memory systems become more susceptible to errors. Although various studies have proposed ECC schemes that can correct multi-bit errors, such schemes incur performance and/or storage overhead. To minimize performance degradation while providing strong error resilience, in this paper, we propose Twin ECC, a low-cost memory protection scheme based on data duplication. Within a 512-bit data block, Twin ECC duplicates meaningful data into the positions occupied by meaningless zeros. Since the ‘1’→‘0’ error pattern is dominant in DRAM cells, Twin ECC provides strong error resilience by performing bitwise OR operations between the original meaningful data and the duplicated data: a bit is lost only if it flips to ‘0’ in both copies. After the bitwise OR operations, Twin ECC applies the SECDED code to further enhance data protection. Our evaluations show that Twin ECC reduces the system failure probability by an average of 64.8%, 56.9%, and 49.6% when the portion of ‘1’→‘0’ errors is 100%, 90%, and 80%, respectively, while causing only 0.7% performance overhead and no storage overhead compared to the baseline ECC DIMM with SECDED code.
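
The duplicate-then-OR idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how meaningless zeros are located within the 512-bit data, so the sketch assumes, purely for illustration, a 64-bit word whose upper 32 bits are all zero. The names `encode`, `decode`, and `inject_one_to_zero` are hypothetical, and the SECDED stage that Twin ECC applies after the OR is omitted.

```python
HALF_BITS = 32                      # assumed width of the meaningful data
HALF_MASK = (1 << HALF_BITS) - 1


def encode(value):
    """Copy the low half into the high half, which is assumed to be all zeros."""
    if value >> HALF_BITS:
        raise ValueError("high half is not all zeros; no room for a duplicate")
    return (value << HALF_BITS) | value


def decode(stored):
    """OR the two copies; a bit is lost only if it flipped 1->0 in both."""
    low = stored & HALF_MASK
    high = stored >> HALF_BITS
    return low | high


def inject_one_to_zero(stored, bit_positions):
    """Model 1->0 DRAM errors by clearing the given bit positions."""
    for pos in bit_positions:
        stored &= ~(1 << pos)
    return stored


value = 0xDEADBEEF
stored = encode(value)

# Three 1->0 errors that hit different bit positions of the two copies.
corrupted = inject_one_to_zero(stored, [0, 35, 63])
recovered = decode(corrupted)

# The same logical bit (bit 0) cleared in both copies cannot be restored.
double_hit = decode(inject_one_to_zero(stored, [0, 32]))
```

In this sketch `recovered` equals `value` even though three bits were cleared, because each cleared bit is still set in the other copy. `double_hit` differs from `value`, and a ‘0’→‘1’ flip would also survive the OR, which is consistent with the abstract applying SECDED afterwards and with the reported benefit shrinking as the share of ‘1’→‘0’ errors drops.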

 
