ECC Techniques for Enabling DRAM Caches with Off-Chip Tag Arrays

Error correcting codes (ECCs) are widely used to provide protection against
data integrity problems in memory. With continued scaling of technology and
lowering of supply voltage, failures in memory are becoming more prevalent.
Moreover, the usage and organization of DRAM have also expanded.
Integrating a large-scale DRAM cache is a promising solution to address the
memory bandwidth challenge, and this is becoming more compelling with
3D-stacking technology. To enable a high-performance DRAM cache, previous
works have proposed storing a tag array off-chip with data in DRAM. The
tag-and-data access inevitably changes the traditional access pattern to memory
and brings new challenges to ECC schemes due to the granularity of the
data access. How to design efficient ECC for new memory usage within the
restrictions of commercial DIMMs has emerged as a new challenge.

In this article, we propose two new ECC techniques, Hybrid ECC and Direct
ECC Compare. Hybrid ECC is a linear ECC that uses the same bit overhead
as a Double Error Correction, Triple Error Detection (DEC-TED) ECC,
but provides error correction for more frequent burst error patterns. Direct
ECC Compare eliminates the delay and gate overhead caused by comparing
multiple encoded words in parallel. The design for off-chip tag storage falls
into two major categories, distributed and continuous. For distributed tag
storage, we propose to store tags in the ECC chip, protected by Hybrid ECC.
For continuous tag storage, we propose a separate ECC for each individual tag
to reduce the bandwidth overhead of tag updates, and we use Direct ECC
Compare to improve the matching latency of encoded tags. A design based on a
16-way set-associative cache shows that a 30-percent gate count reduction and
a 12-percent latency reduction are achieved.
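
The idea of matching encoded tags without decoding each one first can be
sketched as follows. This is a minimal illustration under stated assumptions,
not the design evaluated in this article: it assumes the comparison is done on
codewords by Hamming distance, substitutes a Hamming(7,4) single-error-correcting
code and 4-bit tags for the article's codes, and uses hypothetical function
names throughout. The incoming tag is encoded once, and a stored word counts as
a match when it lies within distance 1 of that codeword, so no per-way decoder
is needed.

```python
# Illustrative sketch only: Hamming(7,4) stands in for the article's ECC,
# and all names here are hypothetical.

def hamming74_encode(tag: int) -> int:
    """Encode a 4-bit tag into a 7-bit Hamming(7,4) codeword."""
    d = [(tag >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]
    p2 = d[0] ^ d[2] ^ d[3]
    p3 = d[1] ^ d[2] ^ d[3]
    bits = [p1, p2, d[0], p3, d[1], d[2], d[3]]
    word = 0
    for i, b in enumerate(bits):
        word |= b << i
    return word


def direct_compare(stored_word: int, incoming_tag: int, t: int = 1) -> bool:
    """Match if the stored word is within t bit flips of the incoming codeword.

    With minimum distance 3 and t = 1, a stored word holding a different tag
    (with at most one flipped bit) is always at distance >= 2, so it cannot
    be mistaken for a match.
    """
    diff = stored_word ^ hamming74_encode(incoming_tag)
    return bin(diff).count("1") <= t


def lookup(stored_words: list[int], incoming_tag: int):
    """Return the index of the matching way, or None on a miss."""
    for way, word in enumerate(stored_words):
        if direct_compare(word, incoming_tag):
            return way
    return None


# A 4-way set whose second way has suffered a single-bit error.
ways = [hamming74_encode(t) for t in (0x3, 0xA, 0x5, 0xC)]
ways[1] ^= 1 << 4
hit_way = lookup(ways, 0xA)
miss_way = lookup(ways, 0x7)
```

In this sketch the per-way work is an XOR and a population count against a
single encoded word, rather than a full decode of every stored tag before the
comparison.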
