System Design
I. DECPeRLe-1 Board
The project implements the function decomposition machine in hardware. Our
design targets a reconfigurable PAM, the DECPeRLe-1 board. It belongs to a class of
FPGA-based reconfigurable coprocessors called Programmable Active Memories (PAM)
and is built around a 4x4 array of XC3090 FPGAs (Fig. 1). FPGA-based processors exploit the
fact that most of the processing time of a compute-intensive task is spent in a
relatively small portion of the code, so hardware acceleration of that portion
can significantly improve the overall performance.
Fig. 1 DECPeRLe-1 Board Architecture
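The claim above, that accelerating a small but time-dominant portion of the code improves overall performance, can be made concrete with Amdahl's law. The sketch below is only an illustration: the function name, the time fraction, and the speedup factors are assumptions chosen for the example, not measurements of the DECPeRLe-1 board.

```python
def overall_speedup(hot_fraction: float, hot_speedup: float) -> float:
    """Amdahl's law: speedup of the whole task when only the fraction
    `hot_fraction` of its run time is accelerated by `hot_speedup`."""
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must lie in [0, 1]")
    if hot_speedup <= 0.0:
        raise ValueError("hot_speedup must be positive")
    # Time that stays in software plus the shortened time of the hot portion.
    return 1.0 / ((1.0 - hot_fraction) + hot_fraction / hot_speedup)


if __name__ == "__main__":
    # Illustrative numbers only: 90% of the time in the accelerated portion.
    for factor in (10.0, 100.0):
        print(f"hot portion x{factor:g}: overall x{overall_speedup(0.9, factor):.2f}")
```

The part that remains in software bounds the gain, which is one reason the design keeps all decomposition work on the board after the initial data transfer.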
II. Design Principle
Based on the above idea, we follow these principles during the design:
1. The input function is dumped from the host computer only once; all decomposition
work is then done purely in hardware.
2. Each stage of the function decomposition algorithm may use as many of the
resources of the DECPeRLe-1 board as possible; the board can be
reprogrammed in 'ns'.
3. Since the resources are sufficient, each stage uses pipelined, iterative, parallel
architectures as much as possible. Keeping the design as regular as
possible benefits an FPGA-based structure.
4. The data formats at the interfaces between stages must be compatible and consistent.
III. System Overview
The resulting system has the following block diagram:
Fig. 2 System Design -- Block Diagram
In the block diagram above, the function decomposition is partitioned into 5 stages.
The top-level controller (Global Control Unit, GCU) controls the flow of these 5
stages and communicates with the host computer. The circuitry of each stage is
implemented by having the host computer program the board. The input and output
data of each stage are stored in the 4 memory banks around the logic cell
array. From a mapping point of view, the GCU is easily mapped to the CNE/CSW units, which
are connected to the input/output FIFO. The control unit of each stage is also
easily mapped to these two units, while it is convenient to implement the datapath
in the 4x4 LCA. The block diagram also shows the data flow between the stages.
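The stage-by-stage flow described above can be sketched as a small software model of the GCU. This is a minimal illustration, not the hardware controller: the `Stage` record, the `run_gcu` function, and the two callbacks are hypothetical names, the signal and memory bank names are taken from Section IV, and `hf_done` is assumed as the completion signal of the last stage. Loading the input data (system_start) is left out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Stage:
    name: str
    start: str  # signal the GCU asserts to launch the stage
    done: str   # signal the GCU waits for
    bank: str   # memory bank that receives the stage's results


# Names follow Section IV; "hf_done" is an assumption.
STAGES = (
    Stage("Variable Partitioning", "VP_start", "VP_done", "memB"),
    Stage("Compatibility Graph", "cg_start", "cg_done", "memC"),
    Stage("Maximum Clique", "mc_start", "mc_done", "memD"),
    Stage("Generate G Function", "gf_start", "gf_done", "memA"),
    Stage("Generate H Function", "hf_start", "hf_done", "memC"),
)


def run_gcu(program_board: Callable[[Stage], None],
            run_stage: Callable[[Stage], None]) -> List[str]:
    """Walk through the 5 stages and return the sequence of GCU events."""
    trace: List[str] = []
    for index, stage in enumerate(STAGES):
        if index > 0:
            trace.append("next_stage")  # ask the host for the next circuit
        program_board(stage)            # host programs the board for this stage
        trace.append(stage.start)
        run_stage(stage)                # stands in for the stage hardware
        trace.append(f"{stage.done} -> {stage.bank}")
    trace.append("system_done")
    return trace


if __name__ == "__main__":
    for event in run_gcu(lambda stage: None, lambda stage: None):
        print(event)
```

Because the board is reprogrammed between stages, only the memory banks carry data from one stage to the next, which is why the model records the destination bank with each completion event.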
IV. GCU Design
The GCU is the overall controller that coordinates the interaction of the 5 stages. The
main steps are described below:
0. the host computer programs the first-stage circuit.
1. input data are dumped into the input FIFO
Boolean functions are represented by minterm products; only the care terms are given. So the input data
will include:
# of input variables -- n
# of product terms -- m
product term p1
product term p2
...
product term pm
Assert signal system_start (deactivate FifoInEmpty)
2. upon asserting signal system_start:
1). fetch # of inputs (n) to R1
2). fetch # of product terms (m) to R2
3). store n to memA(0)
4). store m to memA(1)
5). initialize counter C=1
6). fetch p(C) to R3
7). store R3 to memA(C+1)
8). increment C
9). if C>m, goto 10
else, goto 6
10). Assert signal VP_start
3. upon asserting signal VP_start ( Variable Partitioning )
1). wait until VP_done
2). store results to memB, which include:
# of variables in bound set -- k
decomposition chart (2^k columns)
4. assert next_stage signal to host computer, wait until board is programmed.
5. assert cg_start signal ( Compatibility Graph )
1). wait until cg_done
2). store results to memC, which include:
The compatibility graph
6. assert next_stage signal to host computer, wait until board is programmed.
7. assert mc_start signal ( Maximum Clique )
1). wait until mc_done
2). store results to memD, which include:
The encoded columns for the compatibility graph
8. assert next_stage signal to host computer, wait until board is programmed.
9. assert gf_start signal ( generate G Function )
1). wait until gf_done
2). store results to memA (overwriting the original truth table), which include:
Truth table of G function
3). also write result to output FIFO
10. assert next_stage signal to host computer, wait until board is programmed.
11. assert hf_start signal ( generate H Function )
1). wait until hf_done
2). store results to memC (input information is taken from memA and memB), which include:
Truth table of H function
3). also write result to output FIFO
12. assert system_done signal to host computer.
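Step 2 is the only step with a fine-grained loop, so it is worth sketching in software. The model below is a hedged illustration: `load_input`, the use of a `deque` for the input FIFO, and a Python list for memA are assumptions. It also assumes that each product term occupies one memory word, that the terms are stored at consecutive addresses memA(2) ... memA(m+1), and that the loop ends once all m terms have been read.

```python
from collections import deque
from typing import Deque, List


def load_input(fifo: Deque[int], mem_a: List[int]) -> int:
    """Copy n, m and the m product terms from the input FIFO into memA.

    Returns the number of product terms stored; in hardware the GCU
    would assert VP_start at this point.
    """
    r1 = fifo.popleft()        # 1) fetch # of input variables (n) to R1
    r2 = fifo.popleft()        # 2) fetch # of product terms (m) to R2
    mem_a[0] = r1              # 3) store n to memA(0)
    mem_a[1] = r2              # 4) store m to memA(1)
    c = 1                      # 5) initialize counter C
    while c <= r2:             # 9) leave the loop once C exceeds m
        r3 = fifo.popleft()    # 6) fetch p(C) to R3
        mem_a[c + 1] = r3      # 7) store R3 at the next free address
        c += 1                 # 8) increment C
    return c - 1


if __name__ == "__main__":
    # Illustrative input: n = 3 variables, m = 2 product terms.
    fifo = deque([3, 2, 0b101, 0b011])
    mem_a = [0] * 8
    stored = load_input(fifo, mem_a)
    print(stored, mem_a[: stored + 2])
```

The encoding of an individual product term is not specified here, so the sketch treats each term as an opaque word.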