

#### Reliability, Availability, Serviceability (RAS) and Management for Non-Volatile Memory Storage

Mohan J. Kumar, Intel Corp; Sammy Nachimuthu, Intel Corp; Dimitris Ziakas, Intel Corp

Santa Clara, CA August 2015





- NVDIMM Types
- Storage vs. Memory Characteristics
- Storage vs. NVDIMM Characteristics
- NVDIMM expectation for storage use
- Storage RAS vs. Memory RAS
- Storage RAS vs. NVDIMM RAS
- Storage Management vs. Memory Management (DRAM)
- Storage Management vs. NVDIMM Management
- Summary



# **NVDIMM** Types

#### **NVDIMM-N**

- Only DRAM is addressable by SW
- NV media acts as backup for DRAM
- NV media is not addressable
- At least 1:1 capacity ratio between DRAM and NV media
- Tracks DRAM latency and memory channel BW for read and write

#### **NVDIMM-F**

- No DRAM
- NV media is directly addressable via a window mechanism
- Tracks NV media latency
- Benefits from memory channel bandwidth

#### **NVDIMM-P**

- Combination of NVDIMM-N and NVDIMM-F
- Flash memory beyond that needed for persistence is accessible as block
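The window mechanism listed for NVDIMM-F can be pictured with a toy model: the NV media is much larger than the region software can touch, so software selects a block and then reads or writes it through a small aperture. This is only a sketch under assumptions; the class name `WindowedNVDevice`, the `select_block` and `commit` operations, and the 512-byte block size are hypothetical and do not come from any NVDIMM specification.

```python
class WindowedNVDevice:
    """Toy model (hypothetical): NV media reachable only through a small window."""

    def __init__(self, num_blocks: int, block_size: int = 512):
        self.block_size = block_size
        self.num_blocks = num_blocks
        self.media = bytearray(num_blocks * block_size)  # not directly addressable
        self.window = bytearray(block_size)              # the only region software sees
        self._selected = 0                               # stand-in for a control register

    def select_block(self, block: int) -> None:
        """Point the window at a block and load that block's contents into it."""
        if not 0 <= block < self.num_blocks:
            raise IndexError("block out of range")
        self._selected = block
        start = block * self.block_size
        self.window[:] = self.media[start:start + self.block_size]

    def commit(self) -> None:
        """Write the window contents back to the currently selected block."""
        start = self._selected * self.block_size
        self.media[start:start + self.block_size] = self.window


dev = WindowedNVDevice(num_blocks=8)
dev.select_block(3)
dev.window[:5] = b"hello"   # modify block 3 through the window
dev.commit()

dev.select_block(0)
first = bytes(dev.window[:5])   # block 0 was never written
dev.select_block(3)
again = bytes(dev.window[:5])   # block 3 holds the committed data
```

The model shows why access stays block oriented and controller mediated even though the device sits on the memory channel: every access goes through the select step.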




### **Storage vs. Memory Characteristics**

| Attribute                           | Storage (e.g. PCIe)                                  | Memory (e.g. DRAM)                                       |
|-------------------------------------|------------------------------------------------------|----------------------------------------------------------|
| Unit of Access                      | Block                                                | Cacheline                                                |
| Latency for unit access             | ~ µs to ms                                           | ~ 10s of ns                                              |
| Bandwidth                           | IO Channel Bandwidth<br>(e.g. PCIe Gen3 x4 – 4 GB/s) | Memory Channel Bandwidth<br>(e.g. DDR4 2133 – 17.1 GB/s) |
| Interleave for higher perf using HW | No                                                   | Yes                                                      |
| Data Access                         | Controller Mediated                                  | Not Controller Mediated                                  |
| Application Access (Typical)        | Kernel Mediated                                      | No Kernel Mediation                                      |
| Access (Typical)                    | DMA                                                  | Direct Access                                            |
| Expandability                       | Yes (e.g. via switches)                              | No                                                       |
| Attach                              | IO Channel (e.g. PCIe)                               | Memory Channel                                           |
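The two bandwidth figures in the table can be reproduced with simple arithmetic. The sketch below assumes a 64-bit (8-byte) DDR4 data bus and, for PCIe Gen3, 8 GT/s per lane with 128b/130b encoding in one direction; the function names are illustrative only.

```python
def ddr_bandwidth_gbs(mega_transfers_per_s: float, bus_bytes: int = 8) -> float:
    """Peak DDR channel bandwidth: transfers per second times bytes per transfer."""
    return mega_transfers_per_s * 1e6 * bus_bytes / 1e9


def pcie_gen3_bandwidth_gbs(lanes: int) -> float:
    """Peak PCIe Gen3 bandwidth per direction: 8 GT/s per lane, 128b/130b encoding."""
    raw_bits_per_s = 8e9 * lanes
    payload_bits_per_s = raw_bits_per_s * 128 / 130
    return payload_bits_per_s / 8 / 1e9


ddr4_2133 = ddr_bandwidth_gbs(2133)    # about 17.1 GB/s
pcie_x4 = pcie_gen3_bandwidth_gbs(4)   # a little under 4 GB/s

print(f"DDR4-2133 channel: {ddr4_2133:.1f} GB/s")
print(f"PCIe Gen3 x4:      {pcie_x4:.2f} GB/s")
```

These are theoretical peaks; protocol overhead and access patterns reduce what software actually observes on either link.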

# Storage vs. NVDIMM Characteristics

| Attribute                           | Storage (e.g. PCIe)                                  | NVDIMM-N                                                 | NVDIMM-F                   |
|-------------------------------------|------------------------------------------------------|----------------------------------------------------------|----------------------------|
| Unit of Access                      | Block                                                | Cacheline                                                | Block                      |
| Latency for unit access             | ~ µs to ms                                           | ~ 10s of ns                                              | ~ µs                       |
| Bandwidth                           | IO Channel Bandwidth<br>(e.g. PCIe Gen3 x4 – 4 GB/s) | Memory Channel Bandwidth<br>(e.g. DDR4 2133 – 17.1 GB/s) | ~ Memory Channel Bandwidth |
| Interleave for higher perf using HW | No                                                   | Yes                                                      | No                         |
| Data Access                         | Controller Mediated                                  | Not Controller Mediated                                  | Controller Mediated        |
| Application Access (Typical)        | Kernel Mediated                                      | No Kernel Mediation                                      | Kernel Mediated            |
| Access (Typical)                    | DMA                                                  | Direct Access                                            | Direct Access              |
| Channel Expandability               | Yes (e.g. via switches)                              | No                                                       | No                         |
| Attach                              | IO Channel (e.g. PCIe)                               | Memory Channel                                           | Memory Channel             |



### NVDIMM expectation for storage use

- Data errors do not bring down the system
- Access to health information at boot and runtime
- Serviceable
- Data at rest security
- Ease of Migration (in spite of interleave)

#### Storage vs. Memory RAS

| Attribute                                         | Storage (e.g. PCIe)                     | Memory (e.g. DRAM)                                          |
|---------------------------------------------------|-----------------------------------------|-------------------------------------------------------------|
| Write Durability                                  | Yes                                     | NA (Volatile Media, not storage)                            |
| Impact of Error                                   | Application wide                        | System wide*                                                |
| Wear Level Management                             | Yes                                     | No                                                          |
| Platform Single Point of Failure (SPOF) Protection | Multi-ported storage                    | No                                                          |
| Data Protection                                   | Device and Controller Level (ECC, RAID) | Device and Controller Level (ECC, PPR**, Platform Memory RAS) |
| Error Detection Granularity                       | Block                                   | Cacheline                                                   |
| Write Cycle Limit                                 | Yes                                     | No                                                          |

\*High-end systems may support memory error recovery

\*\*Post Package Repair



### **Storage RAS**

- Data protection (device and software level)
- RAID support: various RAID levels, software vs. HW RAID
- Hot add/remove of storage device
- Health and predictive failure reporting: SMART
- Multi-port capability (to overcome platform as SPOF)
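As an illustration of the RAID-style data protection listed above, the following minimal sketch shows single-parity reconstruction in the manner of RAID 5: the parity block is the XOR of the data blocks, so any one lost block can be rebuilt from the survivors. The block contents and the helper name `xor_blocks` are made up for the example, and real RAID implementations add striping, rotation of parity, and metadata that are omitted here.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks on three devices, plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Device 1 fails: rebuild its block from the surviving data and the parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt)
```

Because XOR is its own inverse, the same function serves for computing parity and for reconstruction.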



### **Memory RAS**

- ECC-protected memory
- SDDC (Single Device Data Correction)
- Data scrub: patrol, on demand
- DIMM sparing
- Memory mirroring
- Poison
- Memory error recovery using MCA Recovery
- Hot add of memory
- Memory migration
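To make the idea of ECC-protected memory concrete, here is a toy Hamming(7,4) code that corrects any single flipped bit in a 7-bit word. It is a swapped-in teaching example, not the code used by real memory controllers: server ECC typically protects 64 data bits with 8 check bits, and SDDC relies on stronger symbol-based codes. The function names are illustrative.

```python
def encode(d):
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    c = [0] * 8                      # index 0 unused so indices match positions
    c[3], c[5], c[6], c[7] = d       # data bits live at non-power-of-two positions
    c[1] = c[3] ^ c[5] ^ c[7]        # parity over positions with bit 0 set
    c[2] = c[3] ^ c[6] ^ c[7]        # parity over positions with bit 1 set
    c[4] = c[5] ^ c[6] ^ c[7]        # parity over positions with bit 2 set
    return c[1:]


def decode(word):
    """Return (data bits, error position); position 0 means no error detected."""
    c = [0] + list(word)
    s1 = c[1] ^ c[3] ^ c[5] ^ c[7]
    s2 = c[2] ^ c[3] ^ c[6] ^ c[7]
    s4 = c[4] ^ c[5] ^ c[6] ^ c[7]
    pos = s1 + 2 * s2 + 4 * s4       # syndrome gives the position of the flipped bit
    if pos:
        c[pos] ^= 1                  # correct the single-bit error
    return [c[3], c[5], c[6], c[7]], pos


data_bits = [1, 0, 1, 1]
word = encode(data_bits)
word[4] ^= 1                         # inject an error at position 5 (list index 4)
recovered, error_pos = decode(word)
```

The same principle, scaled up, is what lets a memory controller correct errors at cacheline granularity without involving software.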

## Storage vs. NVDIMM-N RAS

| Attribute                   | Storage (e.g. PCIe)                     | NVDIMM-N                                                                    |
|-----------------------------|-----------------------------------------|-----------------------------------------------------------------------------|
| Write Durability            | Yes                                     | Possible with ADR or PCOMMIT                                                |
| Impact of Error             | Application wide                        | System wide**                                                               |
| Wear Level Management       | Yes                                     | Wear leveling implemented by NVDIMM controller                              |
| Platform SPOF Protection    | Multi-ported storage                    | No                                                                          |
| Data Protection             | Device and Controller Level (ECC, RAID) | Device and Controller Level (ECC, PPR, benefits from all Platform Memory RAS) |
| Error Detection Granularity | Block                                   | Cacheline                                                                   |
| Write Cycle Limit           | Yes                                     | No (backing media only written on system fail conditions)                   |

\*Processor may access more than a byte

\*\*High-end systems may support memory error recovery



# Storage vs. NVDIMM-F RAS

| Attribute                   | Storage (e.g. PCIe)                     | NVDIMM-F                                                            |
|-----------------------------|-----------------------------------------|---------------------------------------------------------------------|
| Write Durability            | Yes                                     | Yes (by storage controller).<br>Also benefits from PCOMMIT          |
| Impact of Error             | Application wide                        | System wide**                                                       |
| Wear Level Management       | Yes                                     | Wear leveling implemented by NVDIMM controller                      |
| Platform SPOF Protection    | Multi-ported storage                    | No                                                                  |
| Data Protection             | Device and Controller Level (ECC, RAID) | NVDIMM Controller Level (does not benefit from Platform Memory RAS) |
| Error Detection Granularity | Block                                   | Cacheline                                                           |
| Write Cycle Limit           | Yes                                     | Yes                                                                 |

\*Processor may access more than a byte

\*\*High-end systems may support memory error recovery



### Storage vs. Memory Management

| Attribute                  | Storage (e.g. PCIe)                        | Memory (e.g. DRAM)              |
|----------------------------|--------------------------------------------|---------------------------------|
| Data at Rest Security      | Capable of support                         | No                              |
| Health                     | SMART                                      | NA (Platform test at each boot) |
| Serviceability             | Typically does not require opening chassis | Requires opening the chassis    |
| Migration across Platforms | Easy                                       | Easy (Volatile Media)           |



### **Storage Management**

- Management integrated in the Storage Controller
- Exposed to software via SMART
- Out of band management interface is proprietary
- Data at rest security built-in
- Partition management and partition access protection implemented by storage controller
- Wear level management is required
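The need for wear level management follows from the write cycle limit of NV media: if a hot logical block always landed on the same physical block, that block would wear out long before the rest. The sketch below is one simple way to spread erases, assuming a hypothetical `WearLeveler` that always redirects a write to the least-worn free physical block. Real controllers use more elaborate policies (static and dynamic leveling, garbage collection) that are not modeled here.

```python
class WearLeveler:
    """Toy dynamic wear leveling: each write goes to the least-worn free block."""

    def __init__(self, physical_blocks: int):
        self.erase_counts = [0] * physical_blocks
        self.mapping = {}                        # logical block -> physical block
        self.free = set(range(physical_blocks))

    def write(self, logical: int) -> int:
        """Remap `logical` to the least-worn free block; return the physical block."""
        target = min(self.free, key=lambda b: (self.erase_counts[b], b))
        self.free.remove(target)
        old = self.mapping.get(logical)
        if old is not None:
            self.erase_counts[old] += 1          # old copy is erased before reuse
            self.free.add(old)
        self.mapping[logical] = target
        return target


leveler = WearLeveler(physical_blocks=4)
for _ in range(100):
    leveler.write(0)                             # hammer a single logical block

print(leveler.erase_counts)
```

With 100 rewrites of one logical block, the 99 resulting erases end up spread almost evenly over the four physical blocks instead of all hitting one.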



### **Memory Management**

- Typically, Platform Management software handles memory predictive failure
- OS-based memory predictive failure management at an address space level
- Otherwise, memory is not a managed entity
- Data at rest security does not apply (volatile memory)
- No need for wear level management

**Traditional View** 



### Storage vs. NVDIMM-N Management

| Attribute                                  | Storage (e.g. PCIe)                        | NVDIMM-N                                                            |
|--------------------------------------------|--------------------------------------------|---------------------------------------------------------------------|
| Data at Rest Security                      | Capable of support                         | No (controller mediation not possible since fronting media is DRAM) |
| Health                                     | SMART                                      | SMART                                                               |
| Serviceability                             | Typically does not require opening chassis | Requires opening the chassis                                        |
| Energy Source Management                   | NA                                         | Required                                                            |
| Relative ease of Migration across Platforms | Good                                       | Difficult* (platform interleave impact)                             |



### Storage vs. NVDIMM-F Management

| Attribute                                  | Storage (e.g. PCIe)                        | NVDIMM-F                                                |
|--------------------------------------------|--------------------------------------------|---------------------------------------------------------|
| Data at Rest Security                      | Capable of support                         | Yes                                                     |
| Health                                     | SMART                                      | SMART                                                   |
| Serviceability                             | Typically does not require opening chassis | Requires opening the chassis                            |
| Energy Source Management                   | NA                                         | NA                                                      |
| Relative ease of Migration across Platforms | Good                                       | Medium* (platform interleave impact could be mitigated) |



#### **NVDIMM Firmware Interface Table (NFIT)**





- NFIT describes the interleave across the channel to software
- NVDIMM-F control and data regions are mapped into system physical address (SPA) space
- Control and data regions are exposed to the driver via NFIT tables
- NVDIMM-F also has an SMBus control region for FW upgrades ...

*[Figure: diagram with labels Controller, Ch1, NVDIMM-F 2, Data Region, Control Region, Device Space, and SMBus Control Region.]*
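To show what it means for software to learn the interleave from firmware, the sketch below translates a system physical address (SPA) into a DIMM index and a device-local offset for a plain round-robin interleave. The parameters `base`, `ways`, and `granularity`, and the function name `spa_to_dimm`, are assumptions for illustration; the actual NFIT structures in the ACPI specification carry more detail than this simplified model.

```python
def spa_to_dimm(spa: int, base: int, ways: int, granularity: int):
    """Map an SPA to (DIMM index, device offset) for a round-robin interleave.

    base:        first SPA of the interleaved range (assumed)
    ways:        number of DIMMs in the interleave set (assumed)
    granularity: bytes placed on one DIMM before moving to the next (assumed)
    """
    offset = spa - base
    if offset < 0:
        raise ValueError("address below the start of the range")
    chunk = offset // granularity            # which interleave chunk the address is in
    dimm = chunk % ways                      # chunks rotate across the DIMMs
    device_offset = (chunk // ways) * granularity + offset % granularity
    return dimm, device_offset


BASE = 0x1_0000_0000
a = spa_to_dimm(BASE + 0,   BASE, ways=2, granularity=256)
b = spa_to_dimm(BASE + 256, BASE, ways=2, granularity=256)
c = spa_to_dimm(BASE + 700, BASE, ways=2, granularity=256)
```

A driver needs this kind of mapping to relate an address it touches, or an error reported at an address, back to a specific NVDIMM.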



## **NVDIMM Management**



Exposing NVDIMM management to software

- Management support integrated in the NVDIMM Controller
- Exposed to software via SMART
- Data at rest security is feasible for NVDIMM-F
- Wear level management is required for NV Media
- NVDIMM interleave management is required: NVDIMMs depend on BIOS cooperation to access the data on subsequent boots (the BIOS has to configure the interleaves and address space mapping identically across boots)
- For NVDIMM-N, platform has to monitor and manage the energy source (battery, supercap)
- Runtime management of NVDIMM exposed to software via ACPI \_DSM (Device Specific Method)
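The requirement that the BIOS reproduce the same interleave on every boot can be demonstrated with a small simulation. Data is striped across two hypothetical DIMMs, then read back once with the original settings and once with a different granularity. The `store` and `load` helpers and all parameter values are invented for this sketch.

```python
def store(data: bytes, ways: int, granularity: int):
    """Stripe data across `ways` DIMMs as an interleaved address map would."""
    dimms = [bytearray() for _ in range(ways)]
    for i in range(0, len(data), granularity):
        dimms[(i // granularity) % ways] += data[i:i + granularity]
    return dimms


def load(dimms, ways: int, granularity: int, length: int) -> bytes:
    """Reassemble data assuming the given interleave settings."""
    out = bytearray()
    cursors = [0] * ways
    for chunk in range(-(-length // granularity)):   # ceiling division
        d = chunk % ways
        out += dimms[d][cursors[d]:cursors[d] + granularity]
        cursors[d] += granularity
    return bytes(out[:length])


original = bytes(range(32))
dimms = store(original, ways=2, granularity=4)       # settings used on first boot

same_config = load(dimms, ways=2, granularity=4, length=32)
other_granularity = load(dimms, ways=2, granularity=8, length=32)
swapped_slots = load(dimms[::-1], ways=2, granularity=4, length=32)
```

Only the read with identical settings returns the original bytes. A changed granularity or a swapped DIMM order yields the same bytes in the wrong order, which is also why migrating interleaved NVDIMMs to another platform is harder than moving a storage device.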



#### Summary

- NVDIMM provides higher performance but requires solving RAS and manageability challenges to be on par with storage
- NVDIMM enumeration at the hardware level is standardized by JEDEC
- Unlike storage, NVDIMMs are heavily dependent on the platform BIOS
- NVDIMM enumeration and configuration are now covered by firmware standards: E820, NFIT (refer to the ACPI 6.0 Specification)
- Runtime management of NVDIMMs and vendor-specific extensions are abstracted via \_DSM (SMART, FW updates...)