



# LDPC Compiler For NAND Flash and SSD Controllers

# Nenad Miladinović, PhD, Proton Digital Systems



An LDPC-based read channel provides a significant (10x-20x) improvement in NAND Flash longevity.



Flash Memory Summit 2012 Santa Clara, CA





- LDPC Compiler supporting a wide range of data-rates
  - 50MB/s to 3.5GB/s for a single LDPC instance (in 40nm process)
- List of parameters selected prior to instantiation:
  - Codeword size (Macro-level: 1KB vs. 0.5KB vs. 2KB, etc.)
  - Several parameters for degree of parallelism and memory access options
- After compilation, each instance supports:
  - Different amounts of parity (code rates) simultaneously
  - Several LDPC codes simultaneously
  - On-the-fly switching from one LDPC code to another
  - Each matrix can be an arbitrary LDPC matrix subject to certain constraints





# Memory LDPC Decoder Core Examples

- Codeword size is 1KB
- Total power is measured for TT, 0.9V, 25C
- TSMC 40G process

| Decoder<br>Throughput | Clock<br>Frequency | LDPC Compiler<br>Options | Gate Count<br>(KG) | Memory<br>Size | Total Power, Beginning-of-Life<br>(Gates+Memory+Leakage) | Total Power, End-of-Life<br>(Gates+Memory+Leakage) |
|-----------------------|--------------------|--------------------------|--------------------|----------------|----------------------------------------------------------|----------------------------------------------------|
| 800 MB/s              | 450 MHz            | Option set 4<br>CW=1KB   | 158.6              | 17.9KB         | 46mW                                                     | 138mW                                              |
| 111 MB/s              | 250 MHz            | Option set 2<br>CW=1KB   | 36.6               | 17.9KB         | 5mW                                                      | 14mW                                               |
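The power and throughput columns can be combined into an energy-per-byte figure, which makes the two cores easier to compare. The following is a minimal sketch using only the numbers in the table; the function name and the assumption that 1 MB = 10^6 bytes are illustrative choices, not part of the original slides.

```python
# Rows copied from the table above: (throughput MB/s, BOL power mW, EOL power mW).
CORES = {
    "Option set 4, 450 MHz": (800, 46, 138),
    "Option set 2, 250 MHz": (111, 5, 14),
}

def energy_per_byte_pj(power_mw: float, throughput_mb_s: float) -> float:
    """Energy per decoded byte in picojoules (assumes 1 MB = 10**6 bytes)."""
    watts = power_mw * 1e-3
    bytes_per_second = throughput_mb_s * 1e6
    return watts / bytes_per_second * 1e12

for name, (mb_s, bol_mw, eol_mw) in CORES.items():
    bol = energy_per_byte_pj(bol_mw, mb_s)
    eol = energy_per_byte_pj(eol_mw, mb_s)
    print(f"{name}: {bol:.1f} pJ/B at BOL, {eol:.1f} pJ/B at EOL "
          f"(EOL/BOL = {eol_mw / bol_mw:.1f}x)")
```

Under these assumptions both cores draw roughly three times more power at End-of-Life than at Beginning-of-Life for the same throughput.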





Examples of LDPC decoder cores (compiler output) for ASIC implementation under various conditions.

| LDPC Compiler<br>Options | Technology Library<br>TSMC 40nm G<br>HVT only<br>Track | Frequency<br>(MHz) | Throughput<br>(MByte/s) | Gate Count<br>(KG) | Memory<br>(KByte) |
|--------------------------|--------------------------------------------------------|--------------------|-------------------------|--------------------|-------------------|
| Option set 1, CW=1KB     | 9T                                                     | 250                | 111                     | 36.6*              | 17.9              |
| Option set 2, CW=1KB     | 9T                                                     | 400                | 222                     | 46.2*              | 17.9              |
| Option set 3, CW=1KB     | 9T                                                     | 600                | 534                     | 70.4*              | 20.5              |
| Option set 4, CW=1KB     | 9T                                                     | 500                | 895                     | 158.6*             | 17.9              |
| Option set 5, CW=1KB     | 9T                                                     | 500                | 1780                    | 301.4*             | 20.5              |
| Option set 5, CW=1KB     | 12T                                                    | 1000               | 3560                    | 391.8*             | 20.5              |

#### \* Gate count is measured in two-input NAND gate equivalents
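Dividing throughput by clock frequency gives the number of bytes each core delivers per clock cycle, which separates the effect of parallelism from the effect of clock speed. The sketch below only rearranges the figures in the table above; the function and variable names are illustrative.

```python
# (option set, track, frequency in MHz, throughput in MB/s), copied from the table above.
ROWS = [
    ("Option set 1", "9T", 250, 111),
    ("Option set 2", "9T", 400, 222),
    ("Option set 3", "9T", 600, 534),
    ("Option set 4", "9T", 500, 895),
    ("Option set 5", "9T", 500, 1780),
    ("Option set 5", "12T", 1000, 3560),
]

def bytes_per_cycle(throughput_mb_s: float, freq_mhz: float) -> float:
    """MB/s divided by MHz is bytes per clock cycle (the 10**6 factors cancel)."""
    return throughput_mb_s / freq_mhz

for name, track, mhz, mb_s in ROWS:
    print(f"{name} ({track}, {mhz} MHz): {bytes_per_cycle(mb_s, mhz):.2f} bytes/cycle")
```

Option set 5 yields the same bytes per cycle on the 9T and 12T libraries, so in this table the higher 12T throughput comes entirely from the doubled clock frequency, at the cost of a higher gate count.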






- ASIC, eASIC and FPGA implementation and integration are supported
- ASIC implementation
  - Cadence Design Flow
  - TSMC libraries
  - Trial place and route at IP/Block level
- eASIC implementation
  - LDPC Compiler is run with a custom option set for eASIC
  - Full integration with the eASIC design flow; designs implemented at clock frequencies up to 500MHz
- FPGA implementation
  - LDPC Compiler is run with a custom option set for FPGA



#### Sufficient Iterations for End-of-Life

LDPC decoder computational load and power consumption increase towards the End-of-Life of the SSD.



- LDPC Compiler guarantees sufficient iterations for End-of-Life
  - Guaranteed sustained 3.5 LDPC iterations for quoted data-rates
  - Maximum iteration limit is programmable and is typically much higher (e.g. 8-128)
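To give a feel for what a sustained 3.5 iterations means at the quoted data rates, the sketch below estimates the clock-cycle budget per iteration. It rests on assumptions that the slides do not state: that throughput is counted in user bytes, that "1KB" means 1024 bytes, and that the core decodes one codeword at a time with no overlap between codewords. The function name and defaults are hypothetical.

```python
USER_BYTES = 1024            # assumption: "1KB" codeword means 1024 user bytes
SUSTAINED_ITERATIONS = 3.5   # sustained iteration count quoted on the slide

def cycles_per_iteration(freq_mhz: float, throughput_mb_s: float,
                         user_bytes: int = USER_BYTES,
                         iterations: float = SUSTAINED_ITERATIONS) -> float:
    """Average clock cycles available per LDPC iteration.

    Assumes one codeword is decoded at a time, so the time per codeword is
    user_bytes / throughput and every cycle in it is spent iterating.
    """
    cycles_per_codeword = user_bytes * freq_mhz / throughput_mb_s
    return cycles_per_codeword / iterations

# Frequency/throughput pairs taken from the ASIC core examples in this deck.
for label, mhz, mb_s in [("Option set 1", 250, 111),
                         ("Option set 4", 500, 895),
                         ("Option set 5, 12T", 1000, 3560)]:
    print(f"{label}: about {cycles_per_iteration(mhz, mb_s):.0f} cycles per iteration")
```

Under the same assumptions, a codeword that runs to a programmed limit of 128 iterations would occupy the decoder for about 128 / 3.5, or roughly 37 times, the average decode time.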



- Full-power LDPC decoder is used for conventional read ("hard-decision decoding")
- This reduces how often a soft-information read is needed
- Example of LDPC Correction Capability:

| Method                      | User<br>Bytes | Parity<br>Bytes | Average bit errors<br>corrected                      |
|-----------------------------|---------------|-----------------|------------------------------------------------------|
| BCH, T=70                   | 1KB           | 123             | 70                                                   |
| LDPC<br>Hard-Input Decoding | 1KB           | 123             | 73                                                   |
| LDPC<br>Soft-Input Decoding | 1KB           | 123             | >186 (=ER @ optimal Vth)<br>>490 (=ER @ nominal Vth) |
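The numbers in the table can be cross-checked with a few lines of arithmetic. The sketch below computes the code rate, the standard upper bound of m·T parity bits for a binary BCH code over GF(2^m), and the raw bit error rate each row corresponds to. It assumes "1KB" means 1024 bytes; the function names are illustrative.

```python
import math

USER_BYTES = 1024    # assumption: "1KB" means 1024 bytes
PARITY_BYTES = 123   # from the table above
CODEWORD_BITS = (USER_BYTES + PARITY_BYTES) * 8

def code_rate(user_bytes: int, parity_bytes: int) -> float:
    """Fraction of the stored codeword that is user data."""
    return user_bytes / (user_bytes + parity_bytes)

def bch_parity_bits(codeword_bits: int, t: int) -> int:
    """Upper bound m*t on parity bits for a binary BCH code correcting t errors.

    m is the smallest field degree with 2**m - 1 >= codeword_bits.
    """
    m = codeword_bits.bit_length()
    return m * t

def raw_ber(bit_errors: int, codeword_bits: int = CODEWORD_BITS) -> float:
    """Bit errors per codeword expressed as a raw bit error rate."""
    return bit_errors / codeword_bits

print(f"code rate: {code_rate(USER_BYTES, PARITY_BYTES):.4f}")
print(f"BCH T=70 parity: {math.ceil(bch_parity_bits(CODEWORD_BITS, 70) / 8)} bytes")
for label, errors in [("BCH", 70), ("LDPC hard-input", 73),
                      ("LDPC soft-input, lower figure", 186),
                      ("LDPC soft-input, upper figure", 490)]:
    print(f"{label}: {errors} bit errors is a raw BER of {raw_ber(errors):.2%}")
```

With a 9176-bit codeword the field degree is 14, so a T=70 BCH code needs at most 980 parity bits, which rounds up to the 123 parity bytes shown in the table.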



- Significant testing and system optimization are required for a full Flash Read Channel solution; LDPC is only one component
- Read Channel testing on various Flash Geometries: 2X/2Ynm, 1Xnm
  - "Special Commands" from different Flash manufacturers





- Testing on full manufacturing yield distribution
  - Flash samples from production line
  - "Bad Samples" from production line

