FPGA  SYNTHESIS  FOR  A  SORTER  ALGORITHM  USING  LEONARDO

                                               Vemulapalli  Vikranth  (vemulapv@ee.pdx.edu)
 
                                                               Edith  A  Vedanayagam  (edith@ee.pdx.edu)
 
 
 

CONTENTS

Introduction
 
 
 

INTRODUCTION

In this project we take a sorter circuit with BIST inserted and, with the help of Mentor's Exemplar Leonardo tool, synthesize it to a vendor-specific netlist for a particular FPGA technology, in our case Xilinx.
Leonardo provides an open and interactive synthesis environment for handling complex designs. Its optimization engine provides full constraint-based timing optimization, and the user can keep adjusting the optimization until the desired results are obtained. Moreover, Leonardo is the only synthesis tool that offers dual ASIC and FPGA output through a single design flow on both UNIX and Windows NT operating systems.
Advantages:
                                      ASIC, FPGA, and CPLD optimization
                                       Mixed Verilog, VHDL and EDIF synthesis
                                       Design re-use with FPGA and ASIC device retargeting
                                       Novice and expert modes of operation
                                       Constraint-based timing optimization
                                       Hierarchy manipulation
                                       Hierarchical design browser
                                       RAM and counter inferencing
                                       Constraint-based static timing analysis
                                       Schematic viewer with critical path highlighting
                                       Open and interactive design environment
                                       Support for EDIF, XNF, VHDL, and Verilog netlists
                                       Node-locked and floating licenses on heterogeneous platforms
                                       Windows and Unix support
 

Features of Leonardo:
Good quality of results
FPGA Architecture Specific Technology (FAST) optimization technology dramatically improves optimization run times while maintaining the highest quality of results in the industry. Optimization algorithms and settings are selected based on area and timing goals and the target technology. FAST enables Leonardo to eliminate unnecessary and unproductive optimizations.

RAM Inference from RTL
Leonardo simplifies the process of incorporating RAMs into a design using logic synthesis. Synchronous and asynchronous technology-specific RAMs are automatically inferred from a 2-dimensional array, eliminating the need to instantiate them by hand while maintaining code portability.

Module Generation
High-level design allows the use of operators such as add, subtract, multiply, compare and increment. These functions may be inefficient to synthesize as random logic, and an ideal implementation for the technology is generally already known. Leonardo's Module Generation (MODGEN) automatically detects many common arithmetic, relational and other data flow operations and uses a technology-specific implementation. This reduces synthesis time and improves results. Module generators are optimized for different architectures and include area and delay trade-offs.

Extensive HDL Language Support
Leonardo with Extreme Technology offers the richest set of synthesizable VHDL and Verilog constructs in the industry, allowing users to take full advantage of their power.

Interactive Design Environment
Leonardo allows users to selectively perform optimizations on a block-by-block basis. When a design doesn't meet timing, a single sub-block can be identified through timing analysis and optimized separately. This eliminates unnecessary optimizations on non-timing-critical blocks, resulting in much smaller designs that meet timing.

Hierarchy Manipulation
Leonardo supports complete hierarchy manipulation. Hierarchy can be preserved throughout the optimization process. The "group" command can be used to combine two or more hierarchical blocks into a single block. The "ungroup" command can be used to selectively dissolve individual blocks of hierarchy.

ASIC Optimization
State-of-the-art area and timing optimization algorithms combined with Exemplar's new Extreme optimization technology make Leonardo the fastest and most productive ASIC synthesis tool on the market today.
Advanced timing analysis employs lookup-table-based non-linear delay models to accurately predict pre-layout delays for deep sub-micron ASIC technologies. SDF is supported for post-layout timing analysis.

User Interface
Leonardo's graphical user interface is designed to simplify the process of interactive synthesis.
 

Leonardo can be run using the graphical user interface or in batch mode. An extensive command line language supporting Tcl allows rapid generation of powerful optimization scripts.

Supported ASICs & FPGAs
                                   ASICs
                                   More than 60 ASIC libraries from over 25 vendors. Check the Exemplar web site for the current list of ASIC libraries.

                                   FPGAs and CPLDs
                                   Actel ACT1, ACT2, ACT3, 1200XL, 3200DX, 40MX, 42MX
                                   Atmel 6K02, 6K04
                                   Altera MAX5000, MAX7000, MAX9000, FLEX 6000, FLEX 8000, FLEX10K
                                   Cypress C340, C370, C380
                                   Lattice pLSI
                                   Lucent Technologies 3000, ORCA 1C, 2C, 2CA, 2TA, 3C/3T
                                   Motorola MPA 1000
                                   QuickLogic pASIC1, pASIC2, pASIC3
                                   Xilinx 3000, 3000A, 3000L, 3100, 3100A, 4000, XC3000, XC4000, XC4000E, XC4000EX, XC4000L, XC4000XL, XC5200, XC7200A, XC7300, XC9500, Spartan
 

LEONARDO PROCESS STEPS:

The following are the elements involved in the tool architecture.

LOGIC SYNTHESIS:
First the analyze command checks the VHDL description of the circuit and creates dependency information. Then the elaborate command constructs the design from the separately analyzed pieces and creates a generic gate-level description. Alternatively, the read command can do both of these steps on flat VHDL designs.

PRE OPTIMIZATION
Simple optimizations such as constant propagation and unused logic elimination are done here. This step is also part of normal optimization and need not be run separately.
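The two simple optimizations named above can be illustrated with a minimal Python sketch. It is not Leonardo's implementation; the netlist format (a dictionary mapping a net name to an operator and its inputs) and the function name simplify are assumptions made for illustration only. Constant propagation folds gates whose output is fixed by a constant input, and unused logic elimination drops gates that no output depends on. Identity simplifications such as OR(0, b) = b are left out to keep the sketch short.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical netlist format: net name -> (operator, inputs).
# An input is either another net name or the constant 0 / 1.
Gate = Tuple[str, List[Union[str, int]]]


def simplify(netlist: Dict[str, Gate], outputs: List[str]):
    """Return (remaining gates, nets found to be constant)."""
    consts: Dict[str, int] = {}
    gates: Dict[str, Gate] = dict(netlist)

    # Constant propagation: repeat until nothing changes.
    changed = True
    while changed:
        changed = False
        for net, (op, ins) in list(gates.items()):
            # Substitute nets already known to be constant.
            vals = [consts.get(i, i) for i in ins]
            value = None
            if op == "AND":
                if 0 in vals:
                    value = 0
                elif all(v == 1 for v in vals):
                    value = 1
            elif op == "OR":
                if 1 in vals:
                    value = 1
                elif all(v == 0 for v in vals):
                    value = 0
            elif op == "NOT":
                if vals[0] in (0, 1):
                    value = 1 - vals[0]
            if value is not None:
                consts[net] = value
                del gates[net]
                changed = True
            elif vals != ins:
                gates[net] = (op, vals)
                changed = True

    # Unused logic elimination: keep only gates reachable from an output.
    live = set()
    stack = [o for o in outputs if o in gates]
    while stack:
        net = stack.pop()
        if net in live:
            continue
        live.add(net)
        stack.extend(i for i in gates[net][1] if i in gates)
    return {n: g for n, g in gates.items() if n in live}, consts


netlist = {
    "n1": ("AND", ["a", 0]),     # always 0
    "n2": ("OR", ["n1", "b"]),   # sees the constant from n1
    "n3": ("NOT", ["c"]),        # drives nothing
    "y": ("AND", ["n2", 1]),
}
reduced, constants = simplify(netlist, ["y"])
print(reduced)
print(constants)
```

In this example the gate n1 is replaced by the constant 0, and n3 disappears because the output y does not depend on it.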

RESOLVE MODGEN
This step selects and instantiates technology-mapped logic for any operators in the design. The selection is done based on circuit topology and user input.

OPTIMIZATION
The rest of the design is technology mapped and optimized. Although area optimization is always done, the optimizer can be instructed to create structures that tend to be smaller or faster.

TIMING OPTIMIZATION
Timing optimization uses constraints and static timing analysis to concentrate the optimization effort on critical paths in the design.

SPECIAL ALGORITHMS
These algorithms are applied depending on the target technology. Examples are "pack CLBs" and "decompose LUTs", which are used for lookup-table technologies.

NETLISTERS INTERFACE
These are netlisters for specific technology formats or for more generic formats like EDIF or structural VHDL.

SCHEMATIC VIEWER
Helps to examine the design in memory through generated schematics. The schematic viewer interfaces with the timing analyzer to allow graphical examination of critical paths. EDIF netlists of the generated schematic can be produced from within the tool.

THE PROCESS:

SETTING THE ENVIRONMENT VARIABLES:

Exemplar's Leonardo 4.2 has been installed on the Solaris 2 machines in the PSU labs. To set up the tool, first add the package 'leonardo4.2'.

Once the package has been added, the necessary environment has to be set. The required variables are EXEMPLAR, which specifies the path to the Exemplar software tree, and LM_LICENSE_FILE, which specifies the pathname of the FLEXlm license server. Set these in the UNIX shell the first time you work with Leonardo.

INVOKING THE TOOL:

The tool can be run in two modes.
The user interface mode is invoked with the command leonardo. It is particularly useful for new users and when the tasks are to be repeated later. The command line interface mode is invoked with the command elsyn. It is usually used by those familiar with Leonardo, when a quick synthesis is needed, or when working from a remote access terminal. For simplicity we will use the user interface mode.

OPTIMIZING FOR AREA:

In this step Leonardo performs area optimization. Going through the details shows how the optimization results can be influenced by manipulating hierarchy and setting various optimization options.

Leonardo Libraries:
Leonardo uses cells from the specified libraries in its processes.
Synthesis libraries are FPGA and ASIC libraries provided by Exemplar for vendors such as Actel, Altera, Atmel, Lucent and Xilinx. The library matching the target technology has to be specified for the tool to use. In our case the target is the XC3090, so we choose Xilinx 3000 as our target library.
 

Modgen libraries, or module generator libraries, are technology-specific operator implementations. Leonardo uses them to resolve instances of arithmetic and relational operators in the design during file read operations, or they are inferred when counters and RAMs are extracted with the pre-optimize command.

The next stage is to specify the file that contains the design to be optimized. The read command reads the specified file and creates designs from it. The library to use must also be specified, but clicking Read in the tool bar and giving the file name makes the tool do this automatically. The read command not only reads the file into the work directory but also synthesizes the code and produces a synthesized output for optimization.

The pre-optimize command allows the results of synthesis to be optimized quickly without loading a library and its custom optimization routines. It does not modify hierarchy, does not map the design to a technology, and does not implement operators with technology-specific cells.

Leonardo treats operators as black boxes during VHDL synthesis. Each black-box operator uses a naming convention that supplies information about the operator. During Resolve Modgen, the tool generates operator netlists to replace the black boxes according to your constraints. The resolve modgen command inserts logic into the blocks created by operators in your original design. When the Modgen library is loaded, Modgen populates the operators with technology-mapped cells; if it is not, generic implementations result. To improve results, the Modgen library can be created or modified.

Optimization is a process of partitioning the circuit, running specific algorithms and testing to see whether improvements are made. Each optimization pass runs a specific set of algorithms, starting with the unmapped design. Four basic techniques are applied:
 BDD construction - Decomposes the logic into a tree of decision blocks
 Factoring - combining like terms to reduce area
 Circuit Restructuring - a global technique
 Remapping - utilizing wider gates
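As a small illustration of the factoring step in the list above, the Python sketch below compares the sum-of-products form a*b + a*c (four literals) with its factored form a*(b + c) (three literals) and checks by exhaustive truth table that the two are equivalent. The function names are invented for this example, and the check is only a way to see why factoring preserves function while reducing area; it says nothing about how Leonardo itself factors logic.

```python
from itertools import product


def sum_of_products(a: bool, b: bool, c: bool) -> bool:
    # a*b + a*c uses four literals.
    return (a and b) or (a and c)


def factored(a: bool, b: bool, c: bool) -> bool:
    # a*(b + c) uses three literals for the same function.
    return a and (b or c)


def equivalent(f, g, n_inputs: int) -> bool:
    """Compare two Boolean functions on every input combination."""
    return all(
        bool(f(*bits)) == bool(g(*bits))
        for bits in product((False, True), repeat=n_inputs)
    )


print(equivalent(sum_of_products, factored, 3))
```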
The optimization can be done in two ways.
Standard optimization runs four passes for FPGA and three for ASIC, while quick effort runs just one pass.
Optimization runs passes on the design and maps the circuit into the target library. Each hierarchical block is optimized separately. Area and timing are reported for each pass, block by block. The options are:
 remap - simply maps the design
 quick or standard - selects the optimization effort described above
 chip or macro - use chip when optimizing the top level, in which case Leonardo automatically inserts I/O buffers; use macro when optimizing a sub-block
 area or delay - selects area mapping or timing-based mapping
 pass or nopass - includes or excludes specific passes
The optimization results and methods can be affected by grouping hierarchical blocks, flattening the design, marking instances in the design as 'don't touch' to prevent their optimization, unfolding designs, and controlling signal fanout. When the process completes, the results are shown on the screen. We saved them in the file area_results_file.
The results of this process are saved using the write command. The output was written to the file area.vhd.

OPTIMIZING FOR PERFORMANCE:
Optimizing the circuit for performance has its own advantages and disadvantages. It uses algorithms that reduce the overall circuit delay or circuit slack time, but in effect it increases the overall circuit area. The techniques include mapping to high macro cells or resizing, increasing parallelism in the circuit, controllability factoring, and the use of redundant logic.
To improve timing performance, Leonardo optimizes each block in the force mode. This may not be advantageous when blocks without critical paths are optimized, since that leads to an unnecessary increase in area. The general optimize timing command is a fully constraint-driven optimizer: the optimization is concentrated only on the paths that violate timing, attempting to reduce the overall slack time to zero.
Leonardo uses static timing analysis to find critical paths. Static timing analysis calculates the delay of a path by adding the delay values along it, without simulating the circuit functionally.
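The idea of static timing analysis can be sketched in a few lines of Python. This is a simplified illustration, not Leonardo's timing engine: it assumes one fixed delay per gate, no wire delay, and an acyclic circuit. The circuit, the delay numbers and the function name arrival_times are invented for the example. The arrival time at a gate output is the gate delay plus the latest arrival among its inputs, and the slack at an output is the required time minus the arrival time.

```python
from typing import Dict, List


def arrival_times(
    delays: Dict[str, int],
    fanin: Dict[str, List[str]],
    input_arrival: Dict[str, int],
) -> Dict[str, int]:
    """Latest arrival time at every net of an acyclic gate network."""
    arrival = dict(input_arrival)

    def visit(node: str) -> int:
        # Each gate is evaluated once; its result is cached in `arrival`.
        if node not in arrival:
            arrival[node] = delays[node] + max(visit(d) for d in fanin[node])
        return arrival[node]

    for node in delays:
        visit(node)
    return arrival


# Hypothetical circuit: g3 is driven by g1 and g2 (delays in ns).
delays = {"g1": 2, "g2": 1, "g3": 3}
fanin = {"g1": ["a", "b"], "g2": ["b"], "g3": ["g1", "g2"]}
arrival = arrival_times(delays, fanin, {"a": 0, "b": 0})

required = 4                        # assumed output required time for g3
slack = required - arrival["g3"]    # negative slack means a violation
print(arrival["g3"], slack)
```

Here the longest path runs through g1 and g3, so the path through g2 would not be a target for timing optimization.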
Gate sizing as done by Leonardo takes into account the capacitive load on the nets, the avoidance of fanout violations, and the timing constraints. It does not change the structure of the design.
Controllability factoring as applied by Leonardo involves selecting a critical path based on slack time, restructuring the circuit in the selected critical path section, and deciding whether to retain the result depending on the cost.
CONSTRAINT EDITOR
Certain performance constraints must be set before optimization: the clock, the input arrival times, and the output required times.

The constraint editor enables you to specify various optimization constraints for ports or nets. Multiple clocks in a design are supported. Select the clock signal and define its specifications. Input arrival times can be specified for the input ports of the design. Similarly, the time by which each output is required can be specified. Apart from the constraint editor, these values can also be specified using the set_attribute commands.
The design is now ready for optimization. The results are reported by the report command and are found in the file delay.vhd.
 
At this point the synthesized and optimized result is saved using the write command. The results are found in the file result.vhd.
 

SCHEMATIC VIEWER:

To get a more realistic view of the picture invoke the schematic viewer.

The netlist of the schematic is found in the file temp.edf.

XILINX BACKEND FLOW:
Mapping of the lookup tables makes it easier for Leonardo to exploit the architectural differences between FPGAs. Leonardo is better placed to assign functions to lookup tables than the place and route tool because it has better access to the user's constraints. This lets it determine accurate area estimates and provide more information on the estimated delays.
LUT mapping is controlled from Leonardo's command line.
Packing of Configurable Logic Blocks (CLBs) can also be done by Leonardo rather than by the Xilinx tools to get better results. It is run as a command and is supported only for certain libraries. It is not supported for XC3000.
The Modgen library provides a better starting point for optimization. The Xilinx Modgen libraries contain implementations of datapath elements. To select between Modgen implementations, use the variable modgen_select.
Timespec generation reduces the possibility that a design will fail timing constraints during post-route simulation. Leonardo converts user constraints and VHDL attributes to equivalent timespec instantiations and timing group assignments for PPR. The timespecs appear as part of the design in the XNF file and constrain the router to limit timing violations after PPR.
The global set/reset (GSR) signal is a dedicated signal in Xilinx FPGAs. Leonardo will attempt to hook up GSR to each flip-flop in the design. GSR is an active-high signal. All flip-flops in the design are initialized from this one signal, which is routed through the hierarchy. The GSR signal is routed only through a flat design or the top level of a hierarchical design. A startup block is automatically included when the GSR variable is set.
First set the global_sr variable to name the active-high reset signal. Then set infer_gsr to automatically OR the GSR signal with the reset logic of each flip-flop.
Leonardo will automatically add global buffers to the clock signal and other global signals. For the XC3000 family only two global buffers are available. Global buffer insertion is on by default and can also be requested by command.
Tristate logic exists in each CLB in the Xilinx arrays. Leonardo will automatically convert bus logic to muxes, removing the tristate functionality, if the tristate_map variable is set. To exclude busses from being translated if they are not always driven, use the preserve_z option.
All DFFs in the Xilinx CLBs have a clock enable input. Clock-enable flip-flops can also sometimes be mapped during optimization by issuing a command, which is useful if the clock enable logic crosses hierarchical boundaries.
Xilinx describes many attributes that can appear on ports or instances within the netlist. Leonardo can generate these in the XNF netlist it produces.
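The tristate-to-mux conversion mentioned above can be made concrete with a short Python model. It is a behavioral sketch under stated assumptions, not a description of what Leonardo emits: the mux is modeled as a simple AND-OR structure, and the function names are invented. The model shows that the two forms agree whenever a driver is enabled, and that they differ when no driver is enabled, because the mux cannot produce a high-impedance value. That difference is the reason an option such as preserve_z matters for busses that are not always driven.

```python
from typing import List, Tuple, Union

Driver = Tuple[int, int]  # (enable, data), each 0 or 1


def tristate_bus(drivers: List[Driver]) -> Union[int, str]:
    """Resolve a bus: 0 or 1, 'Z' if undriven, 'X' on contention."""
    active = {data for enable, data in drivers if enable}
    if not active:
        return "Z"
    if len(active) > 1:
        return "X"
    return active.pop()


def mux_equivalent(drivers: List[Driver]) -> int:
    """AND-OR mux: OR of (enable AND data). It never returns 'Z'."""
    out = 0
    for enable, data in drivers:
        out |= enable & data
    return out


one_enabled = [(0, 1), (1, 1), (0, 0)]
none_enabled = [(0, 1), (0, 1), (0, 0)]
print(tristate_bus(one_enabled), mux_equivalent(one_enabled))
print(tristate_bus(none_enabled), mux_equivalent(none_enabled))
```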

From this point we have to use the M1 software for further analysis.