

## Error Characterization and Comparison of ECCs on MLC and TLC Flash Memories

## Veeresh Taranalli, Eitan Yaakobi, Paul H. Siegel

Center for Magnetic Recording Research University of California, San Diego





- Flash Memory Basics
- Error Characterization
- Performance comparison of BCH, LDPC codes
  - Various decoding techniques
- Polar Codes
  - LP Decoding of Polar codes





### **CMMR – STAR Group Members**

Brian Butler

Scott Kayser

Xiaojie Zhang

Aman Bhatia

Minghai Qin



- Arrays ("blocks") of floating-gate transistors ("cells")
- A cell can support q voltage levels e.g., q = 2, 4, 8.
- Increasing the voltage level (program) of a cell is easy to do.
- To decrease a cell level, we must first erase its entire block, then re-program all cells.
- A block areas is eastly in time new and call waar











## Memory Flash Memory Structure - SLC

- Group of cells → Page
- Group of Pages → Block

#### **Typical SLC Block Layout**

| page 1  |
|---------|
| page 3  |
| page 5  |
| -       |
| -       |
| •       |
| page 63 |
|         |





## Memory Flash Memory Structure - MLC

■ 2 bits/cell → MSB and LSB pages

| Row   | MSB of                       | LSB of                       | MSB of               | LSB of        | MSB/LSB |
|-------|------------------------------|------------------------------|----------------------|---------------|---------|
| index | <b>first</b> 2 <sup>14</sup> | <b>first</b> 2 <sup>14</sup> | last 2 <sup>14</sup> | last $2^{14}$ | 01      |
|       | cells                        | cells                        | cells                | cells         |         |
| 0     | page 0                       | page 4                       | page 1               | page 5        | 00      |
| 1     | page 2                       | page 8                       | page 3               | page 9        |         |
| 2     | page 6                       | page 12                      | page 7               | page 13       | 10      |
| 3     | page 10                      | page 16                      | page 11              | page 17       |         |
| :     | •                            | •                            | •                    | •             | 11      |
| 30    | page 118                     | page 124                     | page 119             | page 125      |         |
| 31    | page 122                     | page 126                     | page 123             | page 127      |         |





## Memory Flash Memory Structure - TLC

#### • 3 bits/cell $\rightarrow$ MSB, CSB and LSB pages

| Row   | MSB of                       | CSB of                       | LSB of                       | MSB of               | CSB of               | LSB of               |
|-------|------------------------------|------------------------------|------------------------------|----------------------|----------------------|----------------------|
| index | <b>first</b> 2 <sup>16</sup> | <b>first</b> 2 <sup>16</sup> | <b>first</b> 2 <sup>16</sup> | last 2 <sup>16</sup> | last 2 <sup>16</sup> | last 2 <sup>16</sup> |
|       | cells                        | cells                        | cells                        | cells                | cells                | cells                |
| 0     | page 0                       |                              |                              | page 1               |                      |                      |
| 1     | page 2                       | page 6                       | page 12                      | page 3               | page 7               | page 13              |
| 2     | page 4                       | page 10                      | page 18                      | page 5               | page 11              | page 19              |
| 3     | page 8                       | <b>page</b> 16               | page 24                      | page 9               | page 17              | page 25              |
| 4     | page 14                      | page 22                      | page 30                      | page 15              | page 23              | page 31              |
| :     | •<br>•                       |                              | • •                          | •                    |                      | •<br>•               |
| 62    | page 362                     | page 370                     | page 378                     | page 363             | page 371             | page 379             |
| 63    | page 368                     | page 376                     |                              | page 369             | page 377             |                      |
| 64    | page 374                     | page 382                     |                              | page 375             | page 383             |                      |
| 65    | page 380                     |                              |                              | page 381             |                      |                      |





- Program/Erase (P/E) cycling of many blocks on MLC and TLC flash memories
- For each block the following steps were repeated:
  - The block is erased.
  - Pseudo-random data are programmed to the block.
  - The data are **read** and **errors** are identified.

#### **Disclaimers**:

- We measured many more P/E cycles than the manufacturer's guaranteed lifetime of the device.
- The experiments were done in laboratory conditions and related factors such as temperature change, intervals between erasures, or multiple readings before erasures were not considered.





## Experiment Setup Flash Chip Interface (Ming I Board)



Courtesy: Non-volatile Systems Laboratory, UCSD







### Experiment Setup FPGA Controller



Courtesy: Non-volatile Systems Laboratory, UCSD









Santa Clara, CA

Center for Magnetic Recording Research









### Memory Results – BER, TLC

#### BER of TLC Flash







### Results – BER per page, TLC







## ECC Comparison for TLC Flash

- BCH Codes
  - Length 2<sup>16</sup>
- LDPC Codes
  - Gallager codes (3,k)-regular, R=0.8, 0.9, 0.925, length 2<sup>16</sup>
  - AR4JA protograph-based codes, R=0.8, lengths 1280, 5120, 20480
  - MacKay codes variable-regular degree (3 or 4); no 4cycles, R=0.82, 0.87, 0.93; lengths 4095, 16383, 32000.
  - IEEE 802.3an\* (10Gb/s Ethernet), R ≈0.84, length 2048.

\* Djurdjevic et al., IEEE Commun. Letters, July 2003.





- BER computed for the first 100 iterations, then every 25<sup>th</sup> iteration from then on.
- Data averaged over 6 TLC blocks.
- BCH decoder: corrects error patterns with up to t errors; detects and leaves unchanged more than t errors.
- LDPC decoders: assume binary symmetric channel model BSC(p), with empirical error probability p.





- Sum-product algorithm (SPA)
  - Floating-point, max iterations 200
  - (5+1)-bit quasi-uniform quantization
- Min-sum algorithm (MSA)
  - No LLR limits, max iterations 200
- Linear programming (LP) decoding
  - Alternating Direction Method of Multipliers (ADMM)\* with new fast "projection step"

\* Barman, et al., Proc. 46th Allerton Conference, Sept. 2011.





# Rate ≈ 0.8, LDPC codes with SPA Decoding

BER of Different Codes of Rate  $\approx 0.8$ 



Flash Memory Summit 2014 Santa Clara, CA

Center for Magnetic Recording Research



# Rate ≈ 0.82, LDPC codes with SPA Decoding

BER of Different Codes of Rate  $\approx 0.8$ 



Flash Memory Summit 2014 Santa Clara, CA

Center for Magnetic Recording Research



## Rate $\approx$ 0.9, LDPC codes with SPA Decoding \*

BER of Different Codes of Rate ~0.9  $10^{-2}$  $10^{-3}$ 10 BER 10 10<sup>-6</sup> RAW BER 10 BCH (R=0.9) Gallager (R=0.9) DJCM-3 (R=0.87) DJCM-4 (R=0.87) 10<sup>-6</sup> 2000 4000 6000 8000 10000 n Program/Erase Cycle

\*Yaakobi, et al., Proc. Int. Conf. on Comp., Network. Commun. (ICNC), Jan.-Feb. 2012.





# Rate ≈ 0.925, LDPC codes with SPA Decoding



\*Yaakobi, et al., *Proc. Int. Conf.* on Comp., Network. Commun. (ICNC), Jan.-Feb. 2012.





### Flash Memory Rate ≈ 0.8, MSA v/s SPA Decoding







## Other Decoding Techniques

- Linear Programming (LP) decoding of LDPC codes
  - Introduced by Feldman in 2003.
  - LP decoding with Alternating Direction Method of Multipliers (ADMM) proposed by Barman, et al., in 2011 to speed up LP decoding
  - A more efficient scheme based upon Adaptive LP decoding (ALP) with fast Cut-Search Algorithm (CSA) to further speed up the key "projection step" in LP-ADMM.<sup>1</sup>
- SPA with (q+1)-bit Quasi-uniform Quantization <sup>2</sup>

<sup>1</sup>X. Zhang and P. H. Siegel, *Proc. Int. Symp. on Info. Theory (ISIT),* July. 2013.

<sup>2</sup>X. Zhang and P. H. Siegel, *Proc. IEEE SPCOM*, July. 2012.





- M4376: MacKay code, length 4376 and rate 0.9356
- DJCM-4: MacKay code, length 3200 and rate 0.93
- LP: ADMM-based LP decoder, max iterations 200
- ft-SPA: floating-point SPA
- Quantized SPA: (5+1)-bit quasi-uniform quantized
  SPA





## Rate $\approx$ 0.925 LP vs. SPA Decoding on TLC





- Best LDPC performance surpasses BCH at all code rates R≈ 0.8, 0.9, 0.925.
- R≈0.8 LDPC codes at 15k cycles has BER comparable to R≈0.9 LDPC codes at 10k cycles.
- MSA was inferior to SPA decoding at R≈0.8.
- LP-ADMM was comparable to SPA decoding at R≈0.925, with slightly steeper slope.
- (5+1)-bit quasi-uniform quantized SPA (not optimized) matches floating-point SPA.





- Based on the phenomenon of channel polarization
- Achieve the capacity of symmetric binary input discrete memoryless channels under successive cancellation (SC) decoding \*
- CRC concatenated Polar codes with SC-List decoding can beat LDPC code performance (Tal and Vardy, 2011)

\*E. Arikan, IEEE Trans. on Info. Theory, July. 2009.





## LP Decoding of Polar Codes

- High density parity check codes
- Goela et al. in 2010 proposed a sparse LP polytope representation which works for a binary erasure channel but doesn't work well for a binary AWGN channel.
- We present results using a modified adaptive LP decoder for polar codes and a lower complexity polytope representation based on reduction of the polar code sparse factor graph.\*

\*V. Taranalli and P. H. Siegel, Proc. Int. Symp. on Info. Theory (ISIT), June-July. 2014.





# (128, 64) Polar Code over a BAWGN channel





30



- Characterization of inter-cell interference (ICI) and design of codes to combat it, e.g., ECC + constrained codes.
- Optimization of ECC schemes for better flash memory channel models e.g., asymmetric channels, nonbinary channels, time-varying channels.
- Adaptation of new ECC techniques (polar codes, spatially-coupled LDPC codes) to flash memory applications.





## Thank You!

