

# Increase Controller Performance & Energy Efficiency

Without sacrificing programmability

Chris Rowen

Founder, CTO Tensilica



### Who is Tensilica?

Leading supplier of processor cores and software for the "data plane"



#### **Business Model – Semiconductor IP licensing**

- Processor IP for the data plane
  - Deeply embedded control, DSP, application-specific accelerators
- Shipping in over 20 application areas
  - Storage, Audio, Baseband, Printers, Cameras, Network infrastructure/access...
- Licensed by several major flash controller companies



#### **Market**

- Nearly 2 Billion cores shipped!
  - Run-rate approaching 1B cores/yr
- 190+ Licensees worldwide
  - By 8 of the top 12 semiconductor manufacturers
  - In 7 of the top 12 Smartphone manufacturers' products



#### **Company Facts**

- Privately held. Venture backed. Profitable, cash positive for many years.
- Headquarters and major operations in Santa Clara, CA
- Sales offices worldwide (US, UK, Japan, Korea, China, Taiwan)



Your current design has one or more processor cores...



...you need 2x more processing in the next design, but energy consumption & programmability are also important.

What are your options...



E.g., 2x faster in the next design: what are the options?

Run the core at 2x clock frequency



May not be possible

#### Benefits

Easy development

#### Costs

Lower energy efficiency

Pushing process limits results in a proportionally larger, higher-power core.
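The efficiency cost can be sketched with the standard first-order CMOS dynamic-power approximation. This model is not taken from the slides; the symbols are the usual textbook ones and are assumptions here: $\alpha$ is the switching activity factor, $C$ the switched capacitance, $V$ the supply voltage and $f$ the clock frequency. Leakage is ignored.

$$
P_{\text{dyn}} \approx \alpha C V^{2} f, \qquad E_{\text{cycle}} = \frac{P_{\text{dyn}}}{f} \approx \alpha C V^{2}
$$

At fixed $V$ and $C$, doubling $f$ doubles power but leaves energy per cycle unchanged. In practice, reaching 2x $f$ usually needs a higher $V$ or larger, faster gates (a higher $C$), and either one raises the energy spent per cycle. That is one way to read the "lower energy efficiency" cost above.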




Add more cores



#### Benefits

Manageable hardware changes.

Familiar development environment.

#### Costs

Software partitioning work.

Coherency management in Hardware and/or Software

Similar energy efficiency (slightly worse because of management overhead).
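A rough way to see why efficiency stays near 1x: let $T$ be the throughput of one core and $P$ its power, and let $P_{\text{ovh}}$ be the extra power spent on arbitration and coherency. These symbols are hypothetical, introduced only for this sketch, and the workload is assumed to partition perfectly across the two cores.

$$
\eta_{1} = \frac{T}{P}, \qquad \eta_{2} = \frac{2T}{2P + P_{\text{ovh}}} = \eta_{1} \cdot \frac{1}{1 + P_{\text{ovh}} / (2P)}
$$

With no overhead, $\eta_{2} = \eta_{1}$. Any management overhead pushes the ratio slightly below 1, and imperfect partitioning lowers it further.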




Offload bottlenecks with RTL



#### Benefits

Higher energy & area efficiency.

Small software changes.

#### Costs

RTL development and significant verification.

Lose programmability in RTL state machine.
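Whether offloading can reach the 2x goal depends on how much of the runtime the bottleneck accounts for. The sketch below uses Amdahl's law, which the slides do not mention, to estimate this. The function names and the example fractions are illustrative assumptions, not measured data.

```python
def overall_speedup(bottleneck_fraction: float, offload_speedup: float) -> float:
    """Amdahl's law: whole-design speedup when only the bottleneck is accelerated."""
    if not 0.0 <= bottleneck_fraction <= 1.0:
        raise ValueError("bottleneck_fraction must be in [0, 1]")
    if offload_speedup <= 0.0:
        raise ValueError("offload_speedup must be positive")
    remaining = 1.0 - bottleneck_fraction  # part of the runtime left on the core
    return 1.0 / (remaining + bottleneck_fraction / offload_speedup)


def required_offload_speedup(bottleneck_fraction: float, target: float) -> float:
    """Speedup the offload must deliver for the whole design to reach `target`.

    Returns float('inf') when the target cannot be reached by offload alone.
    """
    if not 0.0 <= bottleneck_fraction <= 1.0:
        raise ValueError("bottleneck_fraction must be in [0, 1]")
    if target <= 0.0:
        raise ValueError("target must be positive")
    remaining = 1.0 - bottleneck_fraction
    budget = 1.0 / target - remaining  # normalized time left for the offloaded part
    if budget <= 0.0:
        return float("inf")
    return bottleneck_fraction / budget


if __name__ == "__main__":
    # Hypothetical bottleneck fractions, chosen only for illustration.
    for fraction in (0.4, 0.6, 0.8):
        needed = required_offload_speedup(fraction, target=2.0)
        print(f"bottleneck {fraction:.0%}: offload must be {needed:.2f}x faster")
```

Under this model, a bottleneck covering 40% of the runtime cannot give 2x overall however fast the offload is, while one covering 60% needs roughly a 6x faster offload.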




Offload bottlenecks with Xtensa using TIE

Xtensa is configurable: instructions and functions that are never used need not be included.



#### Benefits

Higher energy & area efficiency.

Dramatically less verification.

Small software changes.

#### Costs

Modest TIE hardware development.




Directly interface to accelerators for faster I/O



Up to 1024 bits wide. Multiple connections. GPIO, FIFO, Memory

#### Benefits

Multiple high-bandwidth interfaces, up to 1024 bits each, usable simultaneously.

Avoids system bus.

No arbitration, frees up bandwidth.

Predictable latency.

#### Costs

Add simple TIE instructions to define interfaces.

Small software changes: the interface is instruction-controlled rather than memory-mapped.
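To put the interface width in perspective, the following sketch computes an upper bound on transfer rate. It assumes one full-width transfer per port per clock cycle and a hypothetical 500 MHz clock. Neither figure comes from the slides, and the function name is illustrative.

```python
def peak_bandwidth_bytes_per_s(width_bits: int, clock_hz: float, ports: int = 1) -> float:
    """Upper bound on transfer rate, assuming one full-width transfer per port per cycle."""
    if width_bits <= 0 or clock_hz <= 0 or ports <= 0:
        raise ValueError("width_bits, clock_hz and ports must all be positive")
    return ports * (width_bits / 8) * clock_hz


if __name__ == "__main__":
    CLOCK_HZ = 500e6  # hypothetical clock rate, not stated in the source

    wide = peak_bandwidth_bytes_per_s(1024, CLOCK_HZ)   # one 1024-bit direct port
    narrow = peak_bandwidth_bytes_per_s(32, CLOCK_HZ)   # a 32-bit path, for comparison
    print(f"1024-bit port: {wide / 1e9:.1f} GB/s peak")
    print(f"32-bit path:   {narrow / 1e9:.1f} GB/s peak")
```

These are peak figures only. Sustained rates depend on how often the software issues the interface instructions and on the device at the other end of the connection.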




## Summary of development options

| Option                          | Approach                 | Δ Size | Δ Energy efficiency | Δ Software                     | Δ Hardware                            |
|---------------------------------|--------------------------|--------|---------------------|--------------------------------|---------------------------------------|
| Core                            | Current design           | -      | -                   | -                              | -                                     |
| Core                            | 2x MHz (if possible)     | ~<2x   | <1x                 | 0                              | 0                                     |
| Core + Core                     | Multiple cores           | ~2x    | ~1x                 | Large: coherence, partitioning | Small: resource arbitration           |
| *With identifiable bottlenecks* |                          |        |                     |                                |                                       |
| Core + RTL                      | Offload, cycle reduction | <<2x   | >>1x                | Small: interfacing             | Very large: RTL design + verification |
| Core                            | Xtensa, cycle reduction  | <<2x   | >>1x                | Very small: add intrinsics     | Small<sup>1</sup>: TIE design         |

<sup>1</sup> Typically small; can scale with the desired performance improvement.



... with programmability in your flash controller





## Flash Controller Offload

## Summary of real Xtensa examples





## For more information

Find your regional contact online: <a href="https://www.Tensilica.com">www.Tensilica.com</a>

**Email:** storage.info@tensilica.com