文章基本信息

标题：Power optimization of combinational circuits mapped on LUT-based FPGAS.
作者：Bucur, Ion ; Stefanescu, Costin ; Cupcea, Nicolae 等
期刊名称：Annals of DAAAM & Proceedings
印刷版ISSN：1726-9679
出版年度：2009
期号：January
语种：English
出版社：DAAAM International Vienna
摘要：Power consumption is becoming one of the most important considerations in VLSI design. High power often run hot and high temperature tends to exacerbate several silicon failure mechanisms. In 90nm technology, dynamic power is still the leading power cause in FPGAs. Therefore, in addition to performance and area optimization a great deal of research has been directed towards issues related to the low power FPGA circuit mapping. Primary focus in this paper is on searching optimized solutions primarily for depth, and dynamic power at the logic level. Secondary target is on minimal depth, area and dynamic power at the same level. It is introduced a new power optimization mapper tool for K-LUT based FPGA.
关键词：Digital integrated circuits;Energy consumption;Integrated circuits;Mathematical optimization;Optimization theory;Programmable logic arrays;Semiconductor chips

Power optimization of combinational circuits mapped on LUT-based FPGAS.

Bucur, Ion ; Stefanescu, Costin ; Cupcea, Nicolae 等

1. INTRODUCTION

Power consumption is becoming one of the most important considerations in VLSI design. High power often run hot and high temperature tends to exacerbate several silicon failure mechanisms. In 90nm technology, dynamic power is still the leading power cause in FPGAs. Therefore, in addition to performance and area optimization a great deal of research has been directed towards issues related to the low power FPGA circuit mapping. Primary focus in this paper is on searching optimized solutions primarily for depth, and dynamic power at the logic level. Secondary target is on minimal depth, area and dynamic power at the same level. It is introduced a new power optimization mapper tool for K-LUT based FPGA.

2. RELATED WORK

There have been done several works for decreasing the power consumption in circuits mapped with FPGAs. Farrahi and Sarrafzadeh studied the technology mapping problem for lookup table-based FPGAs (Farrahi & Sarrafzadeh, 1994). The problem is formulated as assigning LUTs to nodes of a circuit so as to minimize the total estimated power consumption. They did show that the decision version of this problem is NP-complete, even for simple classes of inputs such as 3-level circuits. Anderson and Najm proposed a power-aware technology mapping technique for LUT-based FPGAs which aims to keep nets with high switching activity out of the FPGA routing network and takes an activity-conscious approach to logic replication (Anderson & Najm, 2004).

3. FPGA SPECIFIC POWER ISSUES

FPGAs circuits are developed in CMOS technology. Power dissipation in CMOS circuits comprises both static (leakage) power and dynamic (functional and spurious) power. Dynamic power is caused by transitions on signals of a circuit and is governed by the equation:

[P.sub.d] = 0.5 [[summation].sub.i] [C.sub.i] x [V.sup.2.sub.dd] x d(i) (1)

Where [C.sub.i] is the capacitance of a signal, i; d(i) is referred to as "transition activity" of logic signal i and represents the rate of transitions on signal i (i.e. the number of times that signal i changes its value in unit time) ; and [V.sub.dd] is the supply voltage. The programmability and flexibility of FPGAs are making them less power-efficient than custom ASICs when considering the implementation of a given logic circuit (Ho et al., 2008).

4. DEFINITIONS AND MAIN MODEL

Combinational part of a general logic network N can be represented as a direct acyclic graph (DAG) noted as N(V, E) where V is the set of nodes and E is the set of directed edges. Each node in N(V, E) represents a logical gate (possible complex), and a direct edge (u, v) exists if the output of gate u is an input of the gate v. The set of direct predecessors of gate v is expressed as input(v) and the set of direct predecessors of a graph H [subset or equal to] N(V, E), is similar expressed as input(H). The level of each node u is the largest level of the predecessors of it, incremented. The level of a PI node is zero and the level (depth) of a network is the largest node level in the network. Various approaches to computing dynamic switching activity have been proposed in the literature, and they can be, considered as either simulation-based approaches, or as probability approaches. Probability approach, in networks where there are re-convergent fanouts, is computing an upper bound of line probability. This technique introduces errors and do not detect logic spurious activity. In this paper is used simulation-based approach for 15 combinational circuits of the MCNC benchmark. It was designed and implemented the simulator used for simulation-based logic activity analysis. This simulator has capabilities for capturing the number of logic transitions on each net during simulation, as well as the proportion of time each net spends in the high and low logic states. Circuits are simulated using almost 10 000 randomly generated input vectors. Each time an internal line u is incrementing the logic activity (the number of changes from 1 to 0 or from 0 to 1), it is evaluated the expression:

[absolute value of n(u)/N - n(u) + 1/N+M] [less than or equal to] [epsilon] (2)

In (2) was noted with N the number of simulated vectors after that the number of logic changes, on an arbitrary line u, became n(u). Let be N + M the number of simulated vectors when the number of logic changes, on line w, is incremented (n(u) + 1). From (2) it results that:

[lim.sub.N.[less than or equal to][infinity]] n(u)/N = d(u) (3)

Parameter e (1 >> [epsilon] >0) in (2) is introduced at beginning of the simulation and it represents the precision of simulated-based logic activity determination. Simulator is checking when all internal lines and primary output lines of the circuit fulfill condition (2) and ends the simulation. Computed dynamic activity is based on identical switching activities (0.5) for each primary input. In K-LUT based FPGAs circuits the essential dynamic power consumption is caused by transitions that take place at the inputs and outputs of LUTs. Nets having the greatest transition density have to be, mostly, hidden in LUTs.

5. IMPLEMENTD ALGORITHM

The application, named PwOptMap, is built-up using structures and routines of SIS-1.2 (*** 1994) and appropriate cost functions for power-aware minimal depth mapping. It was used the K-feasible-cones generation method (using K =5). The K-feasible cones generation is applyed during network traverse from primary inputs to primary outputs (Bucur 2007).

Next step, same network is traversed from primary outputs to the primary inputs, starting with the primary outputs having largest mapped depth. The apropriate selection among the K-feasible cones of each node is guided using critical path and several appropriate costs. These costs are using data related to the number of internal nodes of each cone, the count of internal nodes having fan-outs in other feasible cones, etc.

Depth Metric (DpthMtr) for each node u is computed over the best depth K-feasible cone of u (denoted c(u)):

DpthMtr(c(u)) = 1 + min (ppthMtr(v|v [member of] c(u))) (4)

This metric is used to quantify the network depth.

The dissipated power is quantified using the estimated power cost (EPC). It is locally applied for the entire set of K-feasible cones (denoted C(u)), of each node u and it is computed using the expression:

EPC(c(u))= [min.sub.c(u)[epsilon]c(u){[[summation].sub.v[member of]Input(c(u))] d(v) x fanout(v) x EPC(C(v))} (5)

Globally used cost, of each node u, is defined as a weight sum of the depth metric and of the estimated power cost:

globalCost = [[mu].sub.1] x DpthMtr{c{u)) + [[mu].sub.2] x EPC(c(u)) (6)

6. EXPERIMENTAL RESULTS

The two parameters, [[mu].sub.1] and [[mu].sub.2], in (6) were experimentally determined. Best results were obtained when [[mu].sub.2] << [[mu].sub.1] reflecting the preference for optimizing depth over power. Our implemented algorithm did run for mapping into 5-LUT FPGAs several benchmark circuits, listed in Tab. 1. To estimate power consumption using (1) it is required the capacitance of each net. Our attempt was to build-up a tool able to evaluate medium-grain different mapping choices during logic optimization, the estimated dynamic power (EDP) for each node u was simply computed as the product of the node's transition density d(u) and the fan-out of it:

EDP(u) = d(u) x fanoutfu) (7)

PwOptMap is an efficient algorithm being able to compute several low-power optimal options, as can be seen in Table 1. The first option keeps optimum depth and finds power-aware equivalent solutions.

The second option is searching, on the user's option, one of the solutions with optimal depth but performing with improved power consumption. For nodes situated on the critical paths of irrespective networks the optimal depth was computed using this relation:

[optimalDepth(u).sub.u[member of]crltlcalPath] = optimumDepth + [delta] (8)

Values listed in the second column of Tab. 1 were computed using [delta] =1 for nodes belonging to the critical path, while for other nodes, not situated on critical paths, the optimal depth values were at most less or equal to the optimal depth of the circuit.

Area minimization is of vital importance for FPGA mapping. Mapping with optimum depth is searched a power-aware solution having minimal area (number of used LUTs). The third solution targets an optimal area and depth while keeping in low margin the dissipated power (illustrated in the third column of Tab. 1).

7. CONCLUSION AND FUTURE RESEARCH

On average, the detailed experimental results are showing that power-aware mapping for optimal depth the estimated dissipated power is 7.54% less than mapping for optimum depth. Concluding, relaxing mapping conditions for circuits' depth it is leading to less dissipated power. Future research will include implementation of elaborate capacitance models (Anderson & Najm, 2004) and improved cost functions.

8. REFERENCES

Anderson, J.H. & Najm, F.N. (2004). Power Estimation Techniques for FPGAs. IEEE Tran. on VLSI Systems, Vol. 12, No. 10, (Oct. 2004), pp. 1015-1027, ISSN: 1063-8210

Bucur, I. (2007). Performance mapping of k-LUT FPGAs. University Politehnica of Bucharest Scientific Bulletin, Series C: Electrical Engineering, Vol. 69, No. 2, (2007), pp. 49-60, ISSN: 1454-234x

Farrahi, A.H. & Sarrafzadeh, M. (1994). FPGA technology mapping for power minimization, In: Field-Programmable Logic Architectures, Synthesis and Applications, Hartenstein, R.W. & Servit, M.Z., (Eds), pp.66-77, Springer-Verlag, ISBN:3-540-58419-6, Berlin / Heidelberg

Ho, C.H.; Leong, P.H.W.; Luk W. & Wilton, S.J.E. (2008). Rapid estimation of Power Consumption for Hybrid FPGAs, Available from: http://www.cse.cuhk.edu.hk/~phwl /mt/public/archives/papers/hpwr_fpl08.pdf Accessed: 2009-07-14

*** (1994). SIS-1.2, Synthesis of both synchronous and asynchronous sequential circuits, Available from: http://embedded.eecs.berkeley.edu/pubs/downloads/sis/inde x.htm, Accessed on: 2009-07-14

Tab. 1. Experimental results of PwOptMap mapping tool

 Estimated Dynamic Power

Circuit Depth Optimum Optimal Optimal Depth
 Depth Depth & Optimal Area

 5xp1 3 2,92 2,01 1,56
9symml 5 3,89 2,91 2,65
 C499 5 10,76 8,22 7,96
 C880 8 16,69 14,67 14,04
 alu2 8 14,05 12,89 12,25
 apex6 4 22,62 20,58 20,16
 apex7 4 10,45 8,92 8,43
 count 3 3,52 3,04 2,50
 des 5 99,67 96,11 97,44
 duke2 4 9,21 8,64 8,93
misex1 2 2,21 2,11 2,19
 rd84 4 4,58 4,12 4,47
 rot 6 35,51 35,03 34,73
 vg2 4 4,32 3,01 3,95
 z4ml 3 1,49 1,38 1,32
 241,89 223,64 222,58