Prof. Dr.-Ing. Holger Blume, 06.06.2017 Seite 1 New Perspectives for Hearing Aid Hardware Design Prof. Dr.-Ing. Holger Blume Jun.-Prof. Dr.-Ing. Guillermo Payá Vayá, Dipl.-Ing. Lukas Gerlach, M.Sc. Christopher Seifert Institut für Mikroelektronische Systeme Leibniz Universität Hannover


Page 1: New Perspectives for Hearing Aid Hardware Designhadf.hoertech.de/2017/downloads/HADF_Blume_2017.pdf · 2017-06-21 · New Perspectives for Hearing Aid Hardware Design Prof. Dr.-Ing

Prof. Dr.-Ing. Holger Blume, 06.06.2017 Seite 1

New Perspectives for Hearing Aid Hardware Design

Prof. Dr.-Ing. Holger Blume
Jun.-Prof. Dr.-Ing. Guillermo Payá Vayá, Dipl.-Ing. Lukas Gerlach, M.Sc. Christopher Seifert

Institut für Mikroelektronische Systeme
Leibniz Universität Hannover

Contents

• Motivation
• Basics of power consumption
• Concept of an application tailored processor architecture
• Exemplary results for architecture optimization
• Remaining challenges
• New processor design project Smart HeaP
• Summary

Motivation: Digital Hearing Aid Systems

Hearing aid technology requirements
• Low processing delay: <10 ms
• Programmability / flexibility
• Small form factor (higher user acceptance)

[Figure: hearing aid signal chain with microphone, digital signal processor, battery, and speaker]

[Figure: design space comparing μP, DSP, ASIP, and dedicated hardware architectures (semi-custom, custom) in terms of performance, power consumption, programmability, and hardware cost efficiency]

Design of ASIPs for Digital Hearing Aid Systems (1 mW, 1 mm²)

Different Influencing Factors on Power Consumption

• Architecture optimization
• Custom instructions
• Co-processor

Total Power Consumption

$P = \sum_i \sigma_i \, f_i \left( C_i \, U_i^2 + W_{SC,i} \right) + P_{DC}$

Influencing factors: architecture, design, circuit technology; technology, circuit technology, supply voltages.

[Figure: CMOS inverter with supply voltage U_DD, input U_in, output U_out, and load capacitance C; switching-activity example (activity annotations of 0.5) with glitches over one clock period 1/f]
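As a rough numeric illustration of the switching term σ·f·C·U² that dominates the total power equation, the sketch below evaluates it for a single node. The function name and the example values are assumptions for illustration, not figures from the talk.

```c
/* Dynamic (switching) power of one node: sigma * f * C * U^2.
 * sigma:   switching activity (dimensionless)
 * f_hz:    clock frequency in Hz
 * c_farad: switched capacitance in F
 * u_volt:  supply voltage in V
 * Hypothetical helper, not taken from the slides. */
double dynamic_power(double sigma, double f_hz, double c_farad, double u_volt)
{
    return sigma * f_hz * c_farad * u_volt * u_volt;
}
```

Because the voltage enters quadratically, halving the supply voltage at the same activity and frequency quarters this term: dynamic_power(0.5, 10e6, 1e-12, 0.5) is one quarter of dynamic_power(0.5, 10e6, 1e-12, 1.0).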


Low-Power Hearing Aid ASIPs

Real-Time Processing Constraints (t_c)

RISC Processor (a):
• Area $A_a$
• $f_a = N / t_c$
• $P_{dyn,a} = f_a \, C_a \, U^2$

RISC Processor + Custom FUs (b):
• $A_b = 2 A_a$
• $f_b = \frac{1}{4} f_a$
• $P_{dyn,b} = \frac{1}{2} f_a \, C_a \, U^2$

[Figure: processing time lines of both processors within the real-time constraint t_c]
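The factor 1/2 for the processor with custom functional units can be retraced as follows, assuming that the switched capacitance scales with the area (C_b = 2 C_a, an assumption not stated on the slide) and that the supply voltage U stays unchanged:

```latex
\begin{align*}
P_{dyn,b} &= f_b \, C_b \, U^2 \\
          &= \tfrac{1}{4} f_a \cdot 2\, C_a \cdot U^2 \\
          &= \tfrac{1}{2} f_a \, C_a \, U^2 = \tfrac{1}{2} P_{dyn,a}
\end{align*}
```

The lower clock frequency would additionally allow a lower supply voltage, which this sketch does not take into account.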

Baseline Architecture

ASIP: Application-Specific Instruction-Set Processor

[Figure: single-issue baseline pipeline (Issue 0) with instruction memory / cache, PC, instruction decoder, register file (number of ports and registers), pipeline registers IF/DE, DE/RA, RA/EX1, ALU, and data memory / cache]


Baseline Architecture

Basic Architecture Parameters
• Register File Configuration
• Memory System
• Instruction-Set Architecture

Parallelization Techniques
• Number of Parallel Instructions
• SIMD / Subword Parallelism

Specialization Techniques
• Custom Instructions
• Co-processor Architectures

Compiler / Software Support

ASIP: Application-Specific Instruction-Set Processor

[Figure: extended pipeline with instruction memory / cache, PC, three issue slots (Issue 0 to Issue 2), each with its own instruction decoder, register file (number of ports and registers), pipeline registers IF/DE, DE/RA, RA/EX1, EX1/WB, functional units ALU, SIMD, MAC, and SFUs, data memory / cache, and a co-processor]

Xtensa Customizable Processor / Cadence

Baseline Architecture
• Reduced 32-bit ISA, 5 pipeline stages
• 16 KB Instruction Cache and 16 KB Memory Cache

Configurable
• Caches, bus width, GP register file, MUL, MAC, INT, number of load/store units

Expandable
• New instruction, register, ports
• Using TIE language (similar to Verilog)

Area and energy optimization are possible

[www.cadence.com]

Extension of the Xtensa Processor with hardware units

• Using TIE language (Tensilica Instruction Extension)
• Custom instructions, registers and interfaces
• Can be used in the C program code
• SIMD example: 4x 16-bit additions

Definition of a custom instruction (vec4_add16.tie):

    regfile simd64 64 16 v   // 16 x 64-bit registers
    operation vec4_add16 {out simd64 sum, in simd64 A, in simd64 B} {} {
        wire [15:0] result0 = (A[15: 0] + B[15: 0]);
        wire [15:0] result1 = (A[31:16] + B[31:16]);
        wire [15:0] result2 = (A[47:32] + B[47:32]);
        wire [15:0] result3 = (A[63:48] + B[63:48]);
        assign sum = {result3, result2, result1, result0};
    }

Using the custom instruction in C (use_vec4_add16.c):

    #include <xtensa/tie/vec4_add16.h>

    /* VECLEN is not defined on the slide; the application must provide it. */
    simd64 A[VECLEN];
    simd64 B[VECLEN];
    simd64 sum[VECLEN];

    /* Illustrative wrapper (not on the slide): four 16-bit additions per call. */
    void add_vectors(void)
    {
        for (int i = 0; i < VECLEN; i++)
            sum[i] = vec4_add16(A[i], B[i]);
    }
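To make the semantics of the vec4_add16 example checkable without the Xtensa toolchain, the following sketch emulates the four independent 16-bit lane additions on a plain 64-bit integer. The name vec4_add16_emul and the use of uint64_t in place of the simd64 register type are assumptions for illustration.

```c
#include <stdint.h>

/* Software model of a 4 x 16-bit lane-wise addition on a 64-bit word.
 * Each lane wraps modulo 2^16; no carry propagates into the next lane,
 * matching the four separate 16-bit wires in the TIE description. */
uint64_t vec4_add16_emul(uint64_t a, uint64_t b)
{
    uint64_t sum = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint16_t x = (uint16_t)(a >> (16 * lane));
        uint16_t y = (uint16_t)(b >> (16 * lane));
        uint16_t r = (uint16_t)(x + y);          /* truncate to 16 bits */
        sum |= (uint64_t)r << (16 * lane);
    }
    return sum;
}
```

For example, adding 0x0001000100010001 to 0x000100020003FFFF yields 0x0002000300040000: the lowest lane wraps to zero without disturbing its neighbour.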

Xtensa Customizable Processor / Cadence

Configurations Implemented for the HA System
• Baseline (1-Issue-Slot)
• Baseline (2-Issue-Slots) and Baseline (3-Issue-Slots): exploring parallelism
• Customized (1-Issue-Slot): exploring specialization
• Customized (2-Issue-Slots): exploring parallelism and specialization

[www.tensilica.com]

Customized Configuration: Complex Instruction Extensions

[Figure: total number of cycles per audio buffer (0 to 20000) for Analysis Filterbank, Noise Reduction, Amplification, and Synthesis Filterbank; annotations: 50% FFT, 50% IFFT, 65% SQRT, Overlap+Add (twice)]


Customized Configuration: Complex Instruction Extensions

New Register File for SIMD-Op.

COMPLEX ARITHMETIC Op.
• R0 = COMPLEX_ADD(R1,R2)
• R0 = COMPLEX_MUL(R1,R2)
• R0 = COMPLEX_CONJ(R1)
• AR0 = BIT_REVERSE(AR1)

[Figure: COMPLEX_ADD datapath. Register file (R0 to R3, each register 64 bits); two ADD units combine A.real with B.real and A.img with B.img into C.real and C.img]

[Figure: COMPLEX_MUL datapath. Register file (R0 to R3, each register 64 bits); four MUL units form A.real·B.real, A.real·B.img, A.img·B.real, and A.img·B.img; a SUB and an ADD unit produce C.real and C.img]

[Figure: total number of cycles per audio buffer (0 to 20000) for Analysis Filterbank, Noise Reduction, Amplification, and Synthesis Filterbank; annotations: x2.2 reduction (twice), 65% SQRT]
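The complex-arithmetic instructions can be modelled in plain C as shown below. This is a minimal sketch of what COMPLEX_ADD, COMPLEX_MUL, and BIT_REVERSE compute, not the processor's implementation: the lower-case function names, the 16-bit operand width, and the wide result type are assumptions chosen so the arithmetic cannot overflow.

```c
#include <stdint.h>

typedef struct { int16_t real; int16_t img; } cplx16;      /* assumed operand format */
typedef struct { int64_t real; int64_t img; } cplx_wide;   /* overflow-free result   */

/* C = A + B: two additions, one per component. */
cplx_wide complex_add(cplx16 a, cplx16 b)
{
    cplx_wide c;
    c.real = (int64_t)a.real + b.real;
    c.img  = (int64_t)a.img  + b.img;
    return c;
}

/* C = A * B: four multiplications, one subtraction, one addition. */
cplx_wide complex_mul(cplx16 a, cplx16 b)
{
    int64_t rr = (int64_t)a.real * b.real;
    int64_t ri = (int64_t)a.real * b.img;
    int64_t ir = (int64_t)a.img  * b.real;
    int64_t ii = (int64_t)a.img  * b.img;
    cplx_wide c;
    c.real = rr - ii;
    c.img  = ri + ir;
    return c;
}

/* Reverse the lowest `bits` bits of x, as used for FFT index reordering. */
unsigned bit_reverse(unsigned x, unsigned bits)
{
    unsigned r = 0;
    for (unsigned i = 0; i < bits; i++) {
        r = (r << 1) | (x & 1u);
        x >>= 1;
    }
    return r;
}
```

For instance, (1 + 2j)·(3 + 4j) gives -5 + 10j, and bit_reverse(1, 5) gives 16. A single custom instruction replaces the six scalar operations of the multiplication, which is where a cycle reduction in FFT-heavy stages can come from.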

Customized Configuration: Complex Instruction Extensions

SQRT Operations
• LEADING_ONES(R0)
• R0 = SQUARE_ROOT(R1)
• R0 = THRESHOLD(R0,R1,R2)

Newton-Raphson method for square root computation

[Figure: total number of cycles per audio buffer (0 to 20000) for Analysis Filterbank, Noise Reduction, Amplification, and Synthesis Filterbank, before and after the extension; annotations: 65% SQRT, x16 reduction]
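As an illustration of the Newton-Raphson method named on the slide, the sketch below computes an integer square root with the iteration x ← (x + n/x) / 2. It is a generic software version under assumed conventions (32-bit unsigned input, result rounded down, start value derived from the bit length of n), not the custom instruction itself.

```c
#include <stdint.h>

/* Integer square root (floor) by Newton-Raphson iteration.
 * The start value is a power of two that is >= sqrt(n), derived from
 * the position of the most significant set bit, so the sequence
 * decreases monotonically until it reaches floor(sqrt(n)). */
uint32_t isqrt_newton(uint32_t n)
{
    if (n < 2)
        return n;

    unsigned bits = 0;
    for (uint32_t t = n; t != 0; t >>= 1)
        bits++;                                  /* bit length of n */

    uint32_t x = (uint32_t)1 << ((bits + 1) / 2);
    for (;;) {
        uint32_t y = (x + n / x) / 2;            /* Newton step */
        if (y >= x)
            return x;                            /* no further decrease */
        x = y;
    }
}
```

Deriving the start value from the leading bit position keeps the number of iterations small, which is presumably the role of a helper such as LEADING_ONES in the hardware version.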

Exemplary Hearing Aid Processing - ASIP Design Space Exploration

[Figure: estimated total power consumption (mW, 0 to 7) versus clock frequency (MHz, 0 to 70) for Baseline (1-Issue-Slot), Baseline (2-Issue-Slots), Baseline (3-Issue-Slots), Customized (1-Issue-Slot), and Customized (2-Issue-Slots)]

[Figure: estimated silicon area (mm², 0 to 0.7) for the same five configurations, broken down into Memory, Core, Customized RF, and Customized Operations; annotations: x1.1, x1.2]

These estimations are done for a 40 nm low power technology process.

[Werner; Payá Vayá; Blume: "Case Study: Using the Xtensa LX4 Configurable Processor for Hearing Aid Applications", ICT.OPEN 2013]

Exemplary Hearing Aid Processing - ASIP Design Space Exploration

[Figure: estimated silicon area (mm², 0 to 0.7) for Baseline (1-Issue-Slot), Baseline (2-Issue-Slots), Baseline (3-Issue-Slots), Customized (1-Issue-Slot), and Customized (2-Issue-Slots), broken down into Memory, Core, Customized RF, and Customized Operations]

[Figure: estimated total power consumption (mW, 0 to 7) versus clock frequency (MHz, 0 to 70) for the same five configurations; annotations: x3.2, x1.1, x1.4, x1.2, x1.5]

These estimations are done for a 40 nm low power technology process.

Digital Signal Processing Algorithms and Number Systems

Signal Processing Algorithms
• Time domain (real numbers): filter, convolution, compression, …
• Transformations: FFT, …
• Frequency domain (complex numbers): filter, correlation, mixer, …

Hardware architectures for signal processing should be optimized for both number systems for performance and efficiency reasons.

Real- and Complex-Valued Multiply-Accumulate (MAC) Functional Unit Implementation

• A combination of real- and complex-valued arithmetic in one SIMD multiply-accumulate (MAC) functional unit
• Flexible switch between operations
• Reuse of hardware multipliers for different operations

[Figure: ASIP pipeline with three issue slots (Issue 0 to Issue 2), register file, ALU, SIMD, SFUs, data memory / cache, and a combined real/complex MAC unit (CMAC)]

[Gerlach, L.; Payá Vayá, G.; Blume, H.: An Area Efficient Real- and Complex-Valued Multiply-Accumulate SIMD Unit for Digital Signal Processors, SiPS 2015]

Real- and Complex-Valued Multiply-Accumulate (MAC) Functional Unit Implementation

Processor: Kavuaka | 32-Point FFT Cycles | Core Area
Real-valued SIMD MAC | 570 | 0.237 mm²
Real- and Complex-valued SIMD MAC and Butterfly Operations | 135 (Speedup: 4.22x) | 0.255 mm² (Overhead: 7%)

[Figure: Butterfly 8 bit mode]
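For readers unfamiliar with the butterfly operation that the extended MAC unit accelerates, here is a minimal sketch of a generic radix-2 decimation-in-time butterfly in floating point. The slide does not specify the butterfly variant or number format used in Kavuaka, so the struct, the function names, and the use of double are illustrative assumptions.

```c
typedef struct { double re; double im; } cplx;

/* Complex product a * b (four multiplications, one subtract, one add). */
cplx cplx_mul(cplx a, cplx b)
{
    cplx c;
    c.re = a.re * b.re - a.im * b.im;
    c.im = a.re * b.im + a.im * b.re;
    return c;
}

/* Radix-2 butterfly: top = a + w*b, bottom = a - w*b,
 * where w is the twiddle factor of this stage. */
void butterfly_radix2(cplx a, cplx b, cplx w, cplx *top, cplx *bottom)
{
    cplx t = cplx_mul(w, b);
    top->re    = a.re + t.re;
    top->im    = a.im + t.im;
    bottom->re = a.re - t.re;
    bottom->im = a.im - t.im;
}
```

With a = 1, b = j, and w = -j, the product w·b is 1, so the outputs are 2 and 0. One butterfly needs one complex multiplication and two complex additions, which is why a unit that executes it directly can cut the FFT cycle count substantially.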

Real- and Complex-Valued Multiply-Accumulate (MAC) Functional Unit Implementation

[Figure: FFT performance of current DSP architectures]

Custom Coprocessors for Hearing Aids

Example of a co-processor: CORDIC (COordinate Rotation DIgital Computer) coprocessor

Advantages:
• High flexibility
• High accuracy
• Fast computation compared to other approximation algorithms
• Reduced memory requirement compared to look-up-table interpolation

[Figure: co-processor architecture. ASIP with instruction memory (IM), data memory (DM), DMA, audio interface, and serial interface, coupled to an array of processing elements (PE)]

Custom Coprocessors for Hearing Aids

CORDIC (COordinate Rotation DIgital Computer) coprocessor
• Fast computations of non-linear functions
• Hyperbolic and trigonometric operations

Overall speed-up by hardware CORDIC compared to software CORDIC:
• Binaural feedback suppression: 9.62x
• Localization: 3x

[Figure: cycle counts for sine, cosine, exponential, natural logarithm, and square root on KAVUAKA+CORDIC (HW), KAVUAKA (SW), and TI TMS320C6478, on a relative scale (10%, 100%, 140%); data labels in extraction order: 71621, 1259, 71621, 1523, 76668, 1529, 56664, 1134, 59649, 341]

[Gerlach, L.; Nolting, S.; Blume, H.; Payá Vayá, G.; Stolberg, H.; Reuter, C.: A Highly Optimized Arithmetic Software Library and Hardware Co-processor IP for Fixed-Point VLIW-SIMD Processor Architectures, Technology Transfer in Computing Systems (TETRACOM Technology Transfer Project (TTP), 2016)]
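To show why CORDIC suits a small hardware co-processor, the following sketch computes sine and cosine in rotation mode using only additions, power-of-two scalings, and a 16-entry table. It is a textbook fixed-point formulation under assumed parameters (Q16 format, 16 iterations, hypothetical function name), not the KAVUAKA co-processor design.

```c
#include <stdint.h>

#define CORDIC_ITERS 16

/* atan(2^-i) in Q16 (radians * 65536) for i = 0..15. */
static const int32_t cordic_atan_q16[CORDIC_ITERS] = {
    51472, 30386, 16055, 8150, 4091, 2047, 1024, 512,
    256, 128, 64, 32, 16, 8, 4, 2
};

/* Rotation-mode CORDIC. angle_q16 is the angle in radians scaled by 65536
 * and must lie within about +/-1.74 rad; results are Q16 values. */
void cordic_sincos_q16(int32_t angle_q16, int32_t *sin_q16, int32_t *cos_q16)
{
    int32_t x = 39797;        /* 1 / CORDIC gain (about 0.60725) in Q16 */
    int32_t y = 0;
    int32_t z = angle_q16;    /* remaining rotation angle */

    for (int i = 0; i < CORDIC_ITERS; i++) {
        int32_t dx = y / (1 << i);    /* scaling by 2^-i; a shift in hardware */
        int32_t dy = x / (1 << i);
        if (z >= 0) {                 /* rotate counter-clockwise */
            x -= dx;
            y += dy;
            z -= cordic_atan_q16[i];
        } else {                      /* rotate clockwise */
            x += dx;
            y -= dy;
            z += cordic_atan_q16[i];
        }
    }
    *cos_q16 = x;
    *sin_q16 = y;
}
```

For an input of 32768 (0.5 rad) the results are close to 31420 and 57513, the Q16 values of sin(0.5) and cos(0.5). Division is used instead of a right shift only to keep the C code well defined for negative values.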

Exemplary Hearing Aid Processing - ASIP Design Space Exploration

[Figure: estimated silicon area (mm², 0 to 0.7) for Baseline (1-Issue-Slot), Baseline (2-Issue-Slots), Baseline (3-Issue-Slots), Customized (1-Issue-Slot), Customized (2-Issue-Slots), and KAVUAKA, broken down into Memory, Core, Customized RF, and Customized Operations; annotation: x0.5]

[Figure: estimated total power consumption (mW, 0 to 7) versus clock frequency (MHz, 0 to 70) for the same six configurations; annotation: x0.26]

These estimations are done for a 40 nm low power technology process.

State-of-the-art Hearing Aid Systems

ON Semiconductor® Ezairo® 7100 (http://www.onsemi.com)
• 65 nm TSMC mixed signal process
• Peak power consumption: 1-2 mW
• Average power consumption: less than 1 mW

State-of-the-art Hearing Aid Systems

• 0.13 μm SMIC 1P8M CMOS technology
• Average power consumption: 1.1 mW
• 9.3 mm² (3.1 mm x 3 mm)
• Functions: dynamic range compression (WDRC), noise reduction (NR), feedback cancellation (FDC)

C. Chen et al., "A 1V, 1.1mW mixed-signal hearing aid SoC in 0.13μm CMOS process," 2016 IEEE International Symposium on Circuits and Systems

State-of-the-art Hearing Aid Systems

• TSMC 65 nm GP technology
• Average power consumption: 1.3 mW
• 3.61 mm² (1.9 mm x 1.9 mm)
• Functions: auditory compensation, noise reduction, feedback cancellation

K. C. Chang, Y. W. Chen, Y. T. Kuo and C. W. Liu, "A low power hearing aid computing platform using lightweight processing elements," 2012 IEEE International Symposium on Circuits and Systems

Remaining Challenges and Countermeasures

• Leakage currents
• Flexibility and Programmability
• Verification and Testability

Total Power Consumption

$P = \sum_i \sigma_i \, f_i \left( C_i \, U_i^2 + W_{SC,i} \right) + P_{DC}$

Influencing factors: architecture, design, circuit technology; technology, circuit technology, supply voltages.

[Figure: CMOS inverter with supply voltage U_DD, input U_in, output U_out, and load capacitance C; switching-activity example (activity annotations of 0.5) with glitches over one clock period 1/f]

Leakage Currents

• As the technology sizes decrease, the leakage currents increase exponentially.
• There is a strong dependency on the process / technology.

[Figure: off-current I_off (A/μm, 1.0E-14 to 1.0E-04) versus gate length (nm, 10 to 1000) for transistors in production and in research, including Intel 15 nm, 20 nm, and 30 nm transistors] [Veendrick 2007]

CMOS Roadmap

Premium Tier
• Markets: servers, high performance computing and graphics, high-end smartphone, core networking
• Features: high-performance, balanced-cost

Volume Tier
• Markets: low & mid-end smartphones, wireless, IoT, autonomous vehicles, mobile camera
• Features: low-power, cost-effective performance, RF, embedded memory

[Figure: roadmap spanning wireless, battery-powered computing to high-performance computing, with the process nodes 40/55nm, 28nm, 22FDX®, 12FDX™, 14nm FinFET, and 7nm FinFET marked as available now or in development]

© 2017 GLOBALFOUNDRIES

FinFET & FD-SOI – Same Idea, but different Implementation

• Bulk CMOS: lowest cost
• FinFET: fully depleted vertical fin; high performance, high density
• FD-SOI: fully depleted horizontal channel on an ultra-thin buried oxide insulator; cost effective, embedded

© 2017 GLOBALFOUNDRIES

22FDX® – Ultra-low Voltage

Demonstrated 0.4 Volt Vmin capability for ultra-low power designs

Minimum Switching Energy operating point at around 0.4 Volt 1)

• As Vdd decreases, dynamic power and frequency also go down
• Leakage power also reduces as Vdd drops
• PVT variations are more significant at low Vdd and can be compensated for by back-gate bias.

[Figure: normalized energy (0.0 to 1.4) versus Vdd (0.3 V to 1.0 V) showing switching energy, leakage energy, and total energy; SLVT, FBB = 0.8 V]

[Figure: logic Vmin (0.3 V to 0.8 V) for 28nm Poly/SiON (median 0.64 V) and 22FDX (median 0.40 V), a difference of more than 200 mV]

1) Jani Mäkipää, Olivier Billoint: "FDSOI versus BULK CMOS at 28nm node. Which Technology for Ultra-Low Power Design?", 978-1-4673-5762-3/13/$31.00 ©2013 IEEE

© 2017 GLOBALFOUNDRIES

FDSOI – Development and Production Facility FAB1 in Dresden

22FDX™ now in production; 12FDX™ in development at FAB1, with production beginning 2019

• Production facility FAB1 in Dresden for 28nm, 22nm and 12nm (planned)
• 3500 employees, Europe's largest modern semiconductor company
• Development facility for 22nm and 12nm
• 12 billion $ cumulative investments since 1996 (first AMD, later GLOBALFOUNDRIES)
• 1.5 billion € planned investment for capacity extension to 1 million wafers/year


Verification and Testability of New Architectures

Verification and Testability - In-Circuit Emulation

• Verification in an early design phase
• Tape-out summer 2017 (TSMC 40nm LP Tech.)

[Figure: Binaural Localization]

[Seifert, C.; Thiemann, J.; Gerlach, L.; Volkmar, T.; Payá-Vayá, G.; Blume, H.; van de Par, S.: Real-Time Implementation of a GMM-Based Binaural Localization Algorithm on a VLIW-SIMD Processor (accepted), ICME 2017]

In-circuit Emulation of Kavuaka

• Replacement of future ASIC by in-circuit emulation of Kavuaka
• Allows testing and verification with future periphery

[Figure: emulation board with ASIC socket, memory, two stereo codecs (4 input channels, 4 output channels), Bluetooth, USB, battery, and power management; data paths for audio data, configuration/instruction data, and debugging / parameter data]

[Figure: block diagram of the FPGA hosting the Kavuaka ASIP with I2S audio (4 input and 4 output channels), I2C, UART, Bluetooth (BT), USB power, debug & measurement, battery, and test socket & memory]

New Research Project Addressing the Aforementioned Challenges

Smart HeaP

System-on-Chip as Platform for further Innovation

[Figure: SmartHeaP ASIC containing an ASIP, Tensilica HiFi, CPU, A/D converters, and Bluetooth, connected to microphones and loudspeakers; voltage labels 0.4 V and 2.5 V]

New Project: Smart HeaP

[Figure: Smart HeaP project structure linking audiology, algorithms, architectures, hearing aids, technology, ASIP framework, and project management / SoC design]

Summary

Different influencing factors on power consumption
• Architecture optimization
• Custom instructions
• Co-processor

Remaining challenges and countermeasures
• Leakage currents
• Flexibility and Programmability
• Verification and Testability

All of these issues will be addressed in Smart HeaP