Contents
Acknowledgments
About the Authors
Foreword
List of Figures
List of Tables
1 Introduction
2 Focus and Related Work
2.1 Focus of This Work
2.2 Previous Work
2.2.1 ASIP Design Methodologies
2.2.2 ASIP Case Studies
2.2.3 Basic Low-Power Design Techniques
2.2.4 Verification
2.3 Differences to Previous Work
3 Efficient Low-Power Hardware Design
3.1 Metrics of the Implementation and the Hardware Design Methodology
3.1.1 Characteristics of the Implementation
3.1.2 Characteristics of the Design Methodology
3.2 Basics of Low-Energy Hardware Design
3.2.1 Sources of CMOS Energy Consumption
3.2.2 Basic Principles of Lowering the Power Consumption
3.2.3 Measuring and Quantifying Energy-Efficiency
3.3 Techniques to Reduce the Energy Consumption
3.3.1 System and Architecture Level
3.3.2 Register Transfer and Logic Level
3.3.3 Physical Level
3.4 Concluding Remarks
4 Application-Specific Processor Architectures
4.1 Definitions of ASIP Related Terms
4.2 ASIP Applications
4.3 ASIP Design Space
4.3.1 Functional Units
4.3.2 Storage elements
4.3.3 Pipelining
4.3.4 Interconnection Structure
4.3.5 Control Mechanisms
4.3.6 Storage Access
4.3.7 Instruction Coding and Instruction Fetch Mechanisms
4.3.8 Interface Mechanisms
4.3.9 Tightly-Coupled ASIP Accelerators
4.4 Critical Factors for Energy-Efficient ASIPs
4.4.1 Timing and Computational Performance
4.4.2 Energy Consumption
4.4.3 Implementation Area
4.5 Concluding Remarks
5 The ASIP Design Flow
5.1 Example Applications
5.2 Application Profiling and Partitioning
5.2.1 Stimulus Generation for Application Profiling
5.2.2 Application Profiling
5.2.3 HW/SW Partitioning
5.2.4 ASIP Class Selection
5.3 Combined ASIP HW/SW Synthesis and Profiling
5.3.1 ASIP Interface Definition
5.3.2 ASIP ISA Definition
5.3.3 Software Implementation and Tools
5.3.4 Hardware Implementation and Logic Synthesis
5.3.5 Implementation Profiling and Worst Case Runtime Analysis
5.3.6 Iterative ASIP Optimization
5.3.7 Definition of a tightly coupled ASIP Accelerator
5.4 Verification
5.5 Concluding Remarks
6 The ASIP Design Environment
6.1 The LISA Language
6.2 The LISA Design Environment
6.3 Extensions to the LISA Design Environment
6.3.1 Instruction Encoding and Decoder Generation
6.3.1.1 Minimization of the instruction width
6.3.1.2 Minimization of the Toggle Activity
6.3.2 Semi-Automatic Test Case Generation
6.4 Concluding Remarks
7 Case Studies
7.1 Case Study I: DVB-T Acquisition and Tracking
7.1.1 Application Profiling and ASIP Class Selection
7.1.2 Iterative Instruction Set Optimization
7.1.2.1 Example 1: Saturation
7.1.2.2 Example 2: CORDIC
7.1.3 Overall Energy Optimization Results
7.2 Case Study II: Linear Algebra Kernels and Eigenvalue Decomposition
7.2.1 Implementation I: Optimized ASIP with Accelerator
7.2.2 Implementation II: Compiler-Programmed Parameterizable Core with Accelerator
7.2.3 Evaluation Results
7.3 Concluding Remarks
8 Summary
A ASIP Development Using LISA 2.0
A.1 The LISA 2.0 Language
A.2 Design Space Exploration
A.3 Design Implementation
A.4 Software Tools Generation
A.4.1 Compiler Generation
A.4.2 Assembler and Linker Generation
A.4.3 Simulator Generation
A.4.3.1 Interpretive Simulation
A.4.3.2 Compiled Simulation
A.4.3.3 Just-In-Time Cache Compiled Simulation (JIT-CCS)
A.5 System Integration
A.6 Summary
B Computational Kernels
B.1 The CORDIC Algorithm
B.2 FIR Filter
B.3 The Fast Fourier Transformation
B.4 Vector/Matrix Operations
B.5 Complex EVD using a Jacobi-like Algorithm
C ICORE Instruction Set Architecture
C.1 Processor Resources
C.2 Pipeline Organization
C.3 Instruction Summary
C.4 Exceptions to the Hidden Pipeline Model
C.5 ICORE Memory Organization and I/O Space
C.6 Instruction Coding
D Different ICORE Pipeline Organizations
E ICORE HDL Description Templates
E.1 Generic Register File Entity
E.2 Generic Bit-Manipulation Unit
F Area, Power and Design Time for ICORE
G Acronyms
Bibliography