



#### Novel ECC Architecture Enhances Embedded Storage System Reliability

**Jeff Yang**, Principal Engineer, Silicon Motion



- New NAND Flash
  - Advanced smaller process nodes
  - Double capacity, but with half endurance capability





- Floating gate area = Gate length x Gate width
- Smaller gate area  $\rightarrow$  Lower cost + Worse reliability
- ECC is increasingly important to NAND flash.



Flash Memory Summit 2012 Santa Clara, CA



- Fixed code rate: around 0.9, ECC chunk size: 1KB/ 2KB/ 4KB
- Hard-decoding is based on BCH, and soft-decoding is based on LDPC with less than 3-bit channel reliability values.
  - Correction Performance: 4KB better than 1KB
  - Decoding Latency: 1KB better than 4KB
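The trade-off above can be made concrete with a little arithmetic. The following is a minimal sketch, not the presenter's design: it assumes a code rate of exactly 0.9 and uses the standard bound that a binary BCH code over GF(2^m) needs at most m·t parity bits to correct t errors. The function names are hypothetical.

```python
import math

CODE_RATE = 0.9                     # "around 0.9" on the slide; exact value assumed
CHUNK_BYTES = [1024, 2048, 4096]    # 1KB / 2KB / 4KB ECC chunks

def parity_bits(message_bytes, rate=CODE_RATE):
    """Parity bits so that message / (message + parity) is about `rate`."""
    k = message_bytes * 8
    return round(k / rate) - k

def bch_t_estimate(message_bytes, rate=CODE_RATE):
    """Rough correctable-bit count t, assuming about m*t parity bits."""
    k = message_bytes * 8
    p = parity_bits(message_bytes, rate)
    m = math.ceil(math.log2(k + p + 1))   # GF(2^m) large enough for the codeword
    return p // m

for size in CHUNK_BYTES:
    t = bch_t_estimate(size)
    print(f"{size // 1024}KB chunk: {parity_bits(size)} parity bits, "
          f"t ~ {t} ({t / (size // 1024):.1f} per KB)")
```

Under these assumptions the 4KB chunk corrects slightly fewer bits per kilobyte than the 1KB chunk, but it pools errors over four times as many bits, so a locally dense burst is less likely to exceed the budget. The decoder also has to process a codeword four times as long, which is consistent with the latency ordering on the slide.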






### Endurance vs. Retention with Hard-Decoding

#### 1KB BCH Protection

| Bake duration vs. P/E cycles | 600 | 1200 | 1800 | 2400 | 3000 | 3600 |
|------------------------------|-----|------|------|------|------|------|
| 84 hrs                       | O   | X    | X    | X    | X    | X    |
| 74 hrs                       | O   | X    | X    | X    | X    | X    |
| 68 hrs                       | O   | X    | X    | X    | X    | X    |
| 60 hrs                       | O   | X    | X    | X    | X    | X    |
| 52 hrs                       | O   | X    | X    | X    | X    | X    |
| 44 hrs                       | O   | X    | X    | X    | X    | X    |
| 36 hrs                       | O   | X    | X    | X    | X    | X    |
| 28 hrs                       | O   | X    | X    | X    | X    | X    |
| 24 hrs                       | O   | X    | X    | X    | X    | X    |
| 20 hrs                       | O   | X    | X    | X    | X    | X    |
| 16 hrs                       | O   | X    | X    | X    | X    | X    |
| 12 hrs                       | O   | O    | X    | X    | X    | X    |
| 8 hrs                        | O   | O    | X    | X    | X    | X    |
| 4 hrs                        | O   | O    | O    | X    | X    | X    |
| Endurance                    | O   | O    | O    | O    | O    | O    |

O: all data sectors are correctable. X: at least one data sector is uncorrectable.

- 2ynm TLC
- 1KB-based BCH
- Room Temp. Burn-in
- 120 °C Bake for Different Durations
- Data retention matters much more than endurance, because NAND flash is non-volatile memory.
- BCH gives high endurance but poor data retention.





### Endurance vs. Retention with Soft-Decoding

#### 1KB LDPC Protection

| Bake duration vs. P/E cycles | 600 | 1200 | 1800 | 2400 | 3000 | 3600 |
|------------------------------|-----|------|------|------|------|------|
| 84 hrs                       | O   | O    | O    | X    | X    | X    |
| 74 hrs                       | O   | O    | O    | X    | X    | X    |
| 68 hrs                       | O   | O    | O    | X    | X    | X    |
| 60 hrs                       | O   | O    | O    | O    | X    | X    |
| 52 hrs                       | O   | O    | O    | O    | X    | X    |
| 44 hrs                       | O   | O    | O    | O    | X    | X    |
| 36 hrs                       | O   | O    | O    | O    | X    | X    |
| 28 hrs                       | O   | O    | O    | O    | O    | X    |
| 24 hrs                       | O   | O    | O    | O    | O    | X    |
| 20 hrs                       | O   | O    | O    | O    | O    | X    |
| 16 hrs                       | O   | O    | O    | O    | O    | X    |
| 12 hrs                       | O   | O    | O    | O    | O    | X    |
| 8 hrs                        | O   | O    | O    | O    | O    | X    |
| 4 hrs                        | O   | O    | O    | O    | O    | X    |
| Endurance                    | O   | O    | O    | O    | O    | O    |

O: all data sectors are correctable. X: at least one data sector is uncorrectable.

- 2ynm TLC
- 1KB-based LDPC
- Room Temp. Burn-in
- 120°C Bake for Different Durations
- The endurance is larger than 3.6K P/E cycles, about twice that of BCH.
- Under the same data-retention condition, LDPC offers about three times the protection of BCH.
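One way to read the two pass/fail grids side by side is to transcribe them and count. The sketch below does that, treating O as "all sectors correctable" and leaving out the unbaked Endurance row. The helper names are hypothetical, and the slides do not say how the "three times" figure was derived, so the count is only a consistency check.

```python
PE_CYCLES = [600, 1200, 1800, 2400, 3000, 3600]
BAKE_HOURS = [84, 74, 68, 60, 52, 44, 36, 28, 24, 20, 16, 12, 8, 4]

# One string per bake duration (top row first), one character per P/E column.
BCH = ["OXXXXX"] * 11 + ["OOXXXX"] * 2 + ["OOOXXX"]
LDPC = ["OOOXXX"] * 3 + ["OOOOXX"] * 4 + ["OOOOOX"] * 7

def longest_passing_bake(grid):
    """Longest bake (hours) still fully correctable at each P/E count."""
    result = {}
    for col, pe in enumerate(PE_CYCLES):
        passing = [h for h, row in zip(BAKE_HOURS, grid) if row[col] == "O"]
        result[pe] = max(passing, default=0)   # 0 means no baked case passed
    return result

def passing_cells(grid):
    """Number of (bake, P/E) test conditions that were fully correctable."""
    return sum(row.count("O") for row in grid)

print("BCH :", longest_passing_bake(BCH), passing_cells(BCH))
print("LDPC:", longest_passing_bake(LDPC), passing_cells(LDPC))
```

In this transcription BCH passes 18 baked conditions and LDPC passes 60, a ratio of roughly 3.3, which is in line with the three-times statement above.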








- The Vth distribution shifts down, and the noise variance increases.
- The curve is more Gaussian-like (bell-shaped).








- The Vth distribution shifts to the right.
- The noise variance is also increasing.
- The non-Gaussian parts are generated by endurance disturbance.



### **Comparison of Error Histogram**








| LLR Value                | 1        | 2        | 3        | 4        | 5        | 6        | 7        | 8        |
|--------------------------|----------|----------|----------|----------|----------|----------|----------|----------|
| AWGN                     | -4.08764 | -2.47504 | -1.48491 | -0.49495 | 0.49495  | 1.484909 | 2.475038 | 4.087639 |
| Real NAND<br>(P/E = 14K) | -1.87216 | -1.51755 | -0.92549 | -0.3199  | 0.241576 | 0.837346 | 1.47941  | 2.343726 |

- The observation space is one 1KB ECC chunk.
- Endurance cycling increases the occurrence of strong errors.
- When $P/E \ge 5K$, all the uncorrectable codewords have the same noise problem.
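The LLR table can be compared entry by entry. The sketch below assumes that the eight columns are the eight soft-read regions of a 3-bit quantizer, ordered from the most confident negative value to the most confident positive value. That reading is an assumption, as are the variable names.

```python
# Values copied from the LLR table above.
AWGN = [-4.08764, -2.47504, -1.48491, -0.49495,
        0.49495, 1.484909, 2.475038, 4.087639]
REAL = [-1.87216, -1.51755, -0.92549, -0.3199,
        0.241576, 0.837346, 1.47941, 2.343726]

# How much of the AWGN confidence survives in the measured channel.
ratios = [r / a for r, a in zip(REAL, AWGN)]

# Asymmetry: a symmetric channel has LLR[i] == -LLR[7 - i], so the sum is 0.
awgn_asym = [AWGN[i] + AWGN[7 - i] for i in range(4)]
real_asym = [REAL[i] + REAL[7 - i] for i in range(4)]

for i, ratio in enumerate(ratios, start=1):
    print(f"region {i}: real/AWGN = {ratio:.3f}")
print("AWGN asymmetry:", [round(x, 6) for x in awgn_asym])
print("real asymmetry:", [round(x, 6) for x in real_asym])
```

Every measured magnitude is only about half of its AWGN counterpart, and the outermost pair is clearly asymmetric. Both observations fit the statement that cycling produces strong errors, meaning bits that are read with high apparent confidence and are still wrong.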



- Strong errors (a long, flat tail) appear in the endurance test.
  - When P/E is around 5K, the RBER in one block is 2.6e-3, but the worst-case error profile causes the first uncorrectable codeword.
- If soft-decoding is used to extend the endurance, the strong errors may be overcome in two ways:
  - Extend the ECC chunk size (2KB, 4KB)
    - → Decoding latency increases.
  - Reduce the code rate
    - → Storage capacity decreases.
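A quick calculation shows why the block-average RBER does not explain the first failure. The sketch below assumes independent bit errors, a 9102-bit codeword (1KB message at a code rate near 0.9) and a correction capability of 65 bits. None of these three values is given on the slides, and the function names are hypothetical.

```python
import math

RBER = 2.6e-3          # block-average raw bit error rate near 5K P/E (from the slide)
CODEWORD_BITS = 9102   # assumed: 1KB message at code rate ~0.9
ASSUMED_T = 65         # assumed correction capability, not stated in the source

def log_binom_pmf(n, k, p):
    """log P(X == k) for X ~ Binomial(n, p), computed in log space."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))

def prob_more_than(n, t, p):
    """P(X > t): probability that a codeword holds more than t bit errors."""
    return sum(math.exp(log_binom_pmf(n, k, p)) for k in range(t + 1, n + 1))

mean_errors = CODEWORD_BITS * RBER
p_fail = prob_more_than(CODEWORD_BITS, ASSUMED_T, RBER)

print(f"expected errors per codeword: {mean_errors:.1f}")
print(f"P(errors > {ASSUMED_T}) with independent errors: {p_fail:.2e}")
```

With about 24 expected errors per codeword, an independent-error model makes a failure at this capability extremely unlikely. The first uncorrectable codeword therefore has to come from a worse-than-average profile, which is the long, flat tail described above.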





### **RAID-like Protection Within One Block**

- An example on TLC
  - 174 word-lines, 522 logical pages
- The last 10 pages hold vertical parity; the other 512 pages hold real user data.
  - Each bit column provides 1-bit correction and detects mis-correction.
- The small ECC chunks are concatenated into one very long code.
  - This partly overcomes non-AWGN noise.
- Only about 2% extra protection area (10/522 ≈ 1.9%) increases the correction capability.
  - Decoding latency is extended.
- The extra protection area can be changed dynamically.
  - Protection area: from 2% to 4% or 8%
  - This reduces the decoding latency and increases the correction capability.
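The slides do not name the vertical code. As an illustration only, the sketch below assumes a shortened Hamming code on each bit column: 10 parity bits can address up to 1023 positions, so a 522-bit column (512 data bits plus 10 parity bits) gets 1-bit correction, and any syndrome above 522 flags an error pattern that must not be "corrected". All names are hypothetical.

```python
N_PAGES = 522          # logical pages per block (from the slide)
N_PARITY = 10          # pages reserved for vertical parity (from the slide)
PARITY_POS = [1 << i for i in range(N_PARITY)]   # positions 1, 2, 4, ..., 512

def syndrome(column):
    """XOR of the 1-based positions of every set bit in the column."""
    s = 0
    for pos, bit in enumerate(column, start=1):
        if bit:
            s ^= pos
    return s

def encode(data_bits):
    """Place 512 data bits in non-power-of-two positions, then set parity."""
    assert len(data_bits) == N_PAGES - N_PARITY
    column = [0] * N_PAGES
    data = iter(data_bits)
    for pos in range(1, N_PAGES + 1):
        if pos not in PARITY_POS:
            column[pos - 1] = next(data)
    s = syndrome(column)
    for i, pos in enumerate(PARITY_POS):
        column[pos - 1] = (s >> i) & 1       # makes the total syndrome zero
    return column

def decode(column):
    """Return (column, status); status is 'clean', 'corrected' or 'detected'."""
    s = syndrome(column)
    if s == 0:
        return list(column), "clean"
    if s <= N_PAGES:
        fixed = list(column)
        fixed[s - 1] ^= 1                    # single-bit correction
        return fixed, "corrected"
    return list(column), "detected"          # syndrome points outside the block

column = encode([(i * 37 % 5) & 1 for i in range(N_PAGES - N_PARITY)])
damaged = list(column)
damaged[99] ^= 1                             # one bad page bit in this column
print(decode(damaged)[1])
```

In this sketch only some double errors are detected, namely those whose syndrome falls outside positions 1 to 522; the others are mis-corrected or go unnoticed. A real design would presumably rely on the horizontal code to catch those cases.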



#### 8 Column Groups in One Block

| 1KB Message          | Horizontal Parity |
|----------------------|-------------------|
| ⋮                    | ⋮                 |
| VP (Vertical Parity) | Parity on Parity  |

Test Result of 2ynm TLC under extra 1% Protection:

- BCH horizontal hard-decoding
  - Endurance P/E = n
- LDPC horizontal soft-decoding
  - Endurance P/E = 1.5n to 2n
- LDPC with RAID-like protection
  - Endurance P/E = about 4n

### Flexible Protection in 2 Directions



- The vertical protection is programmable.
- The horizontal LDPC code rate is also programmable.
- As a result, this protection scheme can cover and overcome the different kinds of disturbance on various NAND types, for all kinds of applications.





*Figure: Applications A, B, C, and D positioned on Endurance and Retention axes.*

### Strategies Applicable for Different Applications

- App A: Cache SSD
  - Temporary data storage
  - Data retention capability is not important.
  - Lower code rate + higher decoding efficiency
- App B: Read-Only Storage
  - Data retention is the first priority.
  - Detect read disturbance
  - Detect retention disturbance
- App C: General Consumer (uSD, USB)
  - Capacity is important.
  - BCH may be enough, but high reliability still relies on LDPC. Complicated protection cannot be used.
- App D: Consumer SSD (eMMC)
  - Both data retention and endurance matter.
  - LDPC is needed to provide higher reliability.
  - Long decoding latency is not acceptable.






### THANK YOU! Q & A

**Disclaimer Notice** 

Although efforts were made to verify the completeness and accuracy of the information contained in this presentation, it is provided "as is" as of the date of this document and always subject to change.


