Synthesis And Fault Coverage Analysis Of A 16-Bit CPU

Ajay Ojha ( ajay.ojha@intel.com ); Nagesh Venkataramaiah ( nagesh@ims.com )
December 12, 1999

Contents

Introduction
Design description of the 16-bit CPU
Logic synthesis using Altera Flex10 FPGAs
ATPG Library Development
Scan insertion using DFTAdvisor
Test generation using FastScan and Flextest
Simulation data
Results analysis
Summary
References
Appendix-A: VHDL Code listings for the CPU modules
Appendix-B: ATPG Library listing

Introduction

Present-day designs have become very complex and will continue to grow more complex in the foreseeable future. Unfortunately, no design is immune from manufacturing defects; if it were, there would be no need to test parts in manufacturing once functionality had been verified. The intent of manufacturing tests is to catch defects induced by the manufacturing process, not logical faults. Logical verification needs to happen only once, during the design debug and validation phase. After that, manufacturing only needs to ensure that no defect introduced in the manufacturing process changes the functional behavior of the part and makes it non-compliant with its specification. As Si processes keep shrinking to 0.18µ and below, defects will become more and more difficult to catch. Many researchers in the industry have suggested that the stuck@ model, which has been used so successfully, will no longer be sufficient and that other defect models will need to be used for test generation. As our results will show, no one such model provides adequate coverage; a combination of models will need to be used to provide adequate fault coverage.

This project started with the goal of inserting BIST in a moderately complex 16-Bit CPU [Per'98]. However, the goals had to be reset late in the execution phase as various tool related issues prevented timely debug of the design. So, the goals were reset as follows:

  1. Synthesize the complete 16-Bit design using Altera Flex10 FPGA.
  2. Develop an ATPG library for Altera Flex10 FPGA primitive cells.
  3. Use DFTAdvisor to insert full scan in this design.
  4. Run FastScan on the scan-inserted netlist.
  5. Run FastScan on individual modules with different fault models.
  6. Analyze results for fault coverage and sufficiency.

Design Description Of The 16-Bit CPU

Architectural Overview

The design chosen for this project is from [Per'98]. The reason was to use a moderately complex design that has both combinational and sequential logic content. This design has register files used as a memory array, purely combinational logic blocks, an FSM and a tristate bus. The original design was coded in VHDL. The CPU architecture consists of the following units:

ALU: This unit performs arithmetic and logic operations.
Comp: This is a comparator unit and returns a '1' or '0'.
Control: This is the heart of the system. This unit provides all the necessary control signals and controls data flow through the CPU. It has very few inputs but significantly more outputs.
Reg: This unit is used for the address register and the instruction register. The unit needs to be able to capture data on the rising edge of the clock and pass it to the output after a 1 ns delay.
RegArray: The regarray entity is used to model an 8-Bits x 16-Word RAM memory in this design.
Shift: This is a 16-bit shift register providing R/L shift operation. Inputs and outputs are through 16-bit I/O ports.
Trireg: This is a tri-statable 16-bit register component. It is capable both of capturing 16-bit data from the data bus and of driving data onto the bus.

VHDL Code

The VHDL code of the design consists of a top-level entity cpu.vhd and a library file cpulib.vhd, in addition to the entities described above. The VHDL files for this design are listed below. The full listing is included in Appendix-A.

cpulib.vhd
cpu.vhd
alu.vhd
comp.vhd
control.vhd
reg.vhd
regarray.vhd
shift.vhd
trireg.vhd

Logic Synthesis Using Altera Flex10 FPGAs

Technology Selection Criterion

The decision to use Altera Flex10 was based on several observations. Initially, several technology libraries were tried for logic synthesis, including Xilinx xi4, xi5 and xi7 and Actel act3. However, each of these libraries generated functional blocks (LUTs = Look-Up Tables) that could not be modeled for ATPG, and no ATPG library was available for any of these technology synthesis libraries. Another technology library tried was Altera Flex10. Using Leonardo, we could generate a netlist down to primitive cells that could be modeled with some effort. Furthermore, the source of the design was [Per'98], whose author had used Altera Flex10 for all the examples in the book, so success was more likely with this library. The following table summarizes these observations and the selection criteria:

Table-1: Synthesis Library vs ATPG Re-modeling Effort

Synthesis Library | Complex Cells          | ATPG Re-modeling Effort
------------------+------------------------+------------------------
Xilinx xi4        | LUTs                   | Very high
Xilinx xi5        | LUTs                   | Very high
Xilinx xi7        | LUTs                   | Very high
Actel act-3       | Functional blocks      | Very high
Altera Flex10     | Flip Flops w/Set-Reset | Moderate

Leonardo logic Synthesis Process

Leonardo is the latest GUI-based logic synthesis tool from Mentor Graphics, and [Per'98] used it to synthesize many of the examples in the book. Autologic is another, much older, logic synthesis tool from Mentor Graphics, but not all of its components were mounted. For these reasons Leonardo was used for our logic synthesis process.

Since many iterations are required in any design process, a batch mode script as documented below was generated to synthesize our design.

load_library flex10
analyze -work work cpulib.vhd
analyze -work work alu.vhd
analyze -work work comp.vhd
analyze -work work reg.vhd
analyze -work work shift.vhd
analyze -work work control.vhd
analyze -work work regarray.vhd
analyze -work work trireg.vhd
analyze -work work cpu.vhd
elaborate cpu -work work
pre_optimize .work.cpu.rtl -common_logic -unused_logic -extract
load_modgen flex10
resolve_modgen .work.cpu.rtl -default_resolving
set tristate_map TRUE
optimize .work.cpu.rtl -target flex10 -effort standard -chip -area -pass 1
report_area cpu.area -cell_usage
report_delay -num_paths 2
write -format EDIF cpu.edif

This script generates an EDIF-formatted netlist, which is required by DFTAdvisor for scan insertion. Leonardo can also generate the output netlist in VHDL, Verilog and Genie formats. All the files used by this script are documented in Appendix-A.

Command Description

load_library: This command loads Altera Flex10 technology library.

analyze: This command analyzes each of the VHDL source files.

elaborate: This command links all the files to create top level cpu design file.

pre_optimize: Performs technology independent logic optimization.

load_modgen: Loads a module generator library description into Leonardo HDL database.

resolve_modgen: Resolves (fills in) instances of arithmetic, relational and other generic functions (from the operator library) in the design with an implementation.

optimize: Performs technology specific load optimization and technology mapping.

report_area, report_delay: These commands report area and delay statistics.

write: This command generates the final output file in various formats.

Output Netlist Generation Criterion

As described in the last section, even though Leonardo is capable of generating netlists in various formats, there are portability issues with many of the formats. It is not clear whether all the issues were installation related or whether some were genuine portability issues. Furthermore, ONLY EDIF, VHDL and Verilog formatted netlists are supported by BISTa (the Logic BIST insertion tool). VHDL netlists included instances that were not supported by DFTAdvisor or BISTa. When Verilog netlists were used, BISTa complained about not finding some IEEE libraries, even though they were NOT used by Leonardo for logic synthesis; Leonardo used Exemplar-IEEE libraries. This is clearly a portability issue. We modified the Verilog netlists so that the IEEE library references pointed to the Exemplar distribution area, but that did not help. It was later determined that these libraries were not installed where BISTa was searching for them.

For all these reasons, only EDIF-formatted netlists were useful for our project. Hence the decision to use only EDIF netlists for the majority of our work beyond logic synthesis.

ATPG Library Development

Overview

There were two key libraries required for this project, a technology cell library and an ATPG library. Considerable time was spent developing ATPG library as none existed for any technology library. Next section will describe some of the ATPG models in detail and how the ATPG library was generated.

The technology library is needed for logical synthesis of the design. This library contains all the physical and electrical characteristics of standard cells available in the given library. These characteristics are used to select appropriate technology library for a given design. Following is an example of a technology cell for a 2-input AND gate from Synopsys:

library (xyz) {
  cell (and2) {
    area : 5;
    pin (a1, a2) {
      direction : input;
      capacitance : 1;
    }
    pin (o1) {
      direction : output;
      function : "a1 * a2";
      timing () {
        intrinsic_rise : 0.37;
        intrinsic_fall : 0.56;
        rise_resistance : 0.1234;
        fall_resistance : 0.4567;
        related_pin : "a1 a2";
      }
    }
  }
}

ATPG Library was needed for logic gate level netlist generation. This is the netlist that is used by test generation and fault grading tools. This library has no physical, electrical or timing information. Following is an example of such a 2-input AND gate model for Altera Flex10 library:

model and2 (in[0], in[1], out) (
cell_type = and;
input (in[0], in[1]) ()
output (out) (primitive = _and(in[0], in[1], out);)
)

Note that this model calls a primitive "_and". Model generation will be described in the next section, but primitives are what test generation tools understand as the fundamental building blocks. All ATPG models have to be described at this level.

ATPG Model Development Process

Each cell in the synthesis library needs to have an equivalent gate-level model in the ATPG library. We used the schematic view of the cells to generate gate-level models for each of the cells used in our design. We did not model the register array, as it represents memory, and memories are typically tested differently, not with scan. Combinational cell re-modeling was rather straightforward. Below are two examples of such cells: the first is a tri-state buffer and the second is a 2-input mux.

Tri-state buffer:

model tri (in, enable, out) (
input(in) ()
input(enable) ()
output(out) (primitive=_tsh(in , enable, out);)
)

2-Input MUX:

model mux2(A, B, S, O) (
    cell_type = mux S A B;
    input(A, B, S) ()
    output(O) (primitive = _mux(A, B, S, O);)
)

A critical part of library development is to have scannable versions of all the flip-flops used in a given design. We had three different types of flip-flops in our design. Below is a template for a scan cell model:

model model_name(list_of_pins) (
scan_definition (
type = scan_cell_type;
data_in = pin_name;
scan_in = pin_name;
scan_out = pin_name, ...;
scan_enable = pin_name;
scan_enable_inverted = pin_name;
scan_clock = pin_name;
scan_master_clock = pin_name;
scan_slave_clock = pin_name;
offstate_inverted = pin_name, ...;
tie0 = pin_name, ...;
tie1 = pin_name, ...;
usage = <input|output|hol0|hol1>;
non_scan_model = model_name(list_of_pins);
test_clock = pin_name;
test_enable = pin_name;
test_set = pin_name;
test_reset = pin_name;
set_disabled;
reset_disabled;
)
<model or macro description> . . . )

The reader should consult Mentor Graphics's "Design-For-Test Common Resource Manual" for a detailed description of each of these terms. This template served as the basis for our flip-flop cell re-modeling. However, our flip-flops turned out to be more complex, and the modeling took several iterations to get right. The following listing is for one of the most complex DFFs (D-Flip-Flops) that we had to re-model for scan use.

model sdffers (set, reset, in, clk, SI, SE, ce, out ) (
scan_definition (
type = mux_scan;
data_in = in;
scan_in = SI;
scan_enable = SE;
scan_out = out;
non_scan_model = dffers(set, reset, in, clk, ce, out);
)
    input(set) ()
    input(reset) ()
    input(in, SI) ()
    input(clk) ()
    input(SE) ()
    input(ce) ()
    intern (_ND) (primitive = _mux(in, SI, SE, _ND);)
    intern (_NN2) (primitive = _or(ce, SE, _NN2);)
    intern (_NN1) (primitive = _mux(_ND, out, _NN2, _NN1);)
    output (out) (primitive = _dff(set, reset, clk, _NN1, out,);)
)
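
The data path of this scan cell can be summarized behaviorally. The Python sketch below is only an illustration of the intended next-state behavior, under stated assumptions: scan enable selects the scan input, the cell loads when either clock enable or scan enable is active, and otherwise it holds its value. The function name is hypothetical, set and reset are omitted, and the pin ordering of the _mux primitive is not asserted here.

```python
def scan_dff_next_state(q, d, si, se, ce):
    """Next value of the flip-flop output after a clock edge (set/reset omitted).

    q  : current output      d  : functional data input
    si : scan input          se : scan enable      ce : clock enable
    """
    nd = si if se else d        # corresponds to internal node _ND (scan mux)
    load = ce or se             # corresponds to internal node _NN2
    return nd if load else q    # corresponds to internal node _NN1 (hold mux)


# Hold: neither clock enable nor scan enable is active.
print(scan_dff_next_state(q=1, d=0, si=0, se=0, ce=0))
# Functional load: clock enable active, scan enable inactive.
print(scan_dff_next_state(q=0, d=1, si=0, se=0, ce=1))
# Scan shift: scan enable overrides clock enable.
print(scan_dff_next_state(q=0, d=0, si=1, se=1, ce=0))
```

ORing SE into the load condition is what lets the scan chain shift even when the functional clock enable is off.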

ATPG Library

The following types of cells have been re-modeled in our ATPG library. This library is not specific to our design; it is specific to the Altera Flex10 FPGA standard cell synthesis library. It can easily be used for other designs that use the same cells and can easily be extended to include any number of standard cell models.

Table-2: ATPG Library Cell Description.

Cell    | Description       | Cell    | Description
--------+-------------------+---------+---------------------------------
false   | Logical false     | inbuf   | Buffer
true    | Logical true      | outbuf  | Buffer
inv     | Inverter          | vcc     | Vcc
or2     | 2-Input OR gate   | gnd     | Gnd
or3     | 3-Input OR gate   | tri     | Tri-State buffer
or4     | 4-Input OR gate   | dffrs   | DFF w/Set, Reset
and2    | 2-Input AND gate  | sdffrs  | Scanned version of above
and3    | 3-Input AND gate  | dffers  | DFF w/Set, Reset and cell enable
and4    | 4-Input AND gate  | sdffers | Scanned version of above
cascade | Wire              | mux2    | 2-Input MUX
soft    | Wire              | lcell   | Wire

Scan Insertion Using DFTAdvisor

Overview

DFTAdvisor is used for internal scan identification and insertion. Scan design makes a difficult-to-test sequential circuit behave like an easier-to-test combinational circuit. This is achieved by replacing sequential elements with scannable sequential elements (scan cells) and then stitching the scan cells together into scan chains. These chains can be used to shift data in and out when the design is in scan mode. Our design is a full-scan design and uses mux-scan cells. Automatic scan identification is used to obtain an optimum scan solution. The design goes through a series of Design Rule Checks to ensure good testability and scan insertion.
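
As an illustration of how a scan chain shifts data in and out, the following Python sketch models a small chain of mux-scan cells. It is a simplified behavioral model, not DFTAdvisor output: the function names, the 4-cell chain length and the inverting "logic" used in the capture cycle are all assumptions made for the example.

```python
def clock_chain(state, scan_enable, scan_in, data_in):
    """Return the chain state after one clock edge.

    state   : list of 0/1 values; state[0] is the cell nearest the scan input
    data_in : functional D inputs, one per cell (used when scan_enable is 0)
    """
    if scan_enable:
        # Scan mode: every cell takes the value of its predecessor.
        return [scan_in] + state[:-1]
    # Functional (capture) mode: every cell loads its own D input.
    return list(data_in)


def shift(state, pattern):
    """Shift a pattern in serially and collect what appears at the scan output."""
    shifted_out = []
    for bit in pattern:
        shifted_out.append(state[-1])          # scan_out is the last cell
        state = clock_chain(state, 1, bit, None)
    return state, shifted_out


chain = [0, 0, 0, 0]
# Load a test pattern; the first bit shifted in ends up in the last cell.
chain, _ = shift(chain, [1, 0, 1, 1])
# One capture cycle; the stand-in "combinational logic" inverts each bit.
chain = clock_chain(chain, 0, 0, [1 - b for b in chain])
# Unload the captured response.
chain, response = shift(chain, [0, 0, 0, 0])
print(response)
```

In an actual test flow the unload of one response overlaps with the load of the next pattern, which is why the shift routine both drives the scan input and records the scan output.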

The inputs to DFTAdvisor are the synthesized hierarchical netlist generated by Leonardo and the ATPG library. The ATPG library contains the gate-level model descriptions for all technology library cells in this design, along with the model descriptions for all the scan replacement cells. We were only able to read a netlist in EDIF format, for reasons mentioned earlier in this report. The output of DFTAdvisor is a scan-inserted design and setup files for the ATPG tools (FastScan and Flextest); the design can also be used for BIST insertion (BIST Architect).
   
The general design procedure is to take each module separately and go through scan insertion and design rule checking. If each module can be scan inserted, then scan insertion on the integrated design becomes easier. The design consists of many modules; the register array module required a RAM-like model in ATPG and had to be separated. For scan insertion we consider all the modules except the register array. The integration of this module could not be completed at present.

The directories ALU, COMP, CONTROL, REG, REGARRAY, SHIFT and TRIREG contain all the input and output data files for the individual modules. The directory CPU_NO_REGARRAY contains the integrated module in the CPU (without the REGARRAY).

We first discuss the method used for scan insertion of the integrated design. Then we consider a few of the modules separately and compare their testability. If DFTAdvisor is launched from the working directory, then files read from and written to that directory need not have explicit paths.

Scan And Test Logic Insertion Process

Set $MGC_HOME to point to mentor.C.4 and change directory to where the source files are located. Typically DFTAdvisor is invoked as follows:

prompt>$MGC_HOME/bin/dftadvisor -EDIF <filename>.edif -lib <atpg lib name>

    <filename> is the name of the Leonardo generated output in edif format.
    <atpg lib name> is the ATPG library name.

For our design, out_cpu_flex10.edif using ATPG library atglib_flex10, DFTAdvisor was invoked as follows:

prompt>$MGC_HOME/bin/dftadvisor -EDIF out_cpu_flex10.edif -lib atglib_flex10

The DFTadvisor shell operates in two modes, SETUP and DFT. SETUP mode is used to setup the scan insertion process, DFT is used to actually insert the scan cells. After loading the design the shell mode is set into SETUP.

Specifying Clock Signals:
Dftadvisor must be aware of the circuit clocks to determine which sequential elements are eligible for scan. Clocks are considered to be any signals that have the ability to alter the state of a sequential device ( clocks, set, reset). At the setup prompt enter,

SETUP> add clocks 0 /clock /reset

Since the design is an integration of modules, the clock signals to the modules are driven by state machine flip-flops in the control unit and are not connected directly to the input clock. This creates a testability problem: test points are needed to control these clocks during scan. DFTAdvisor inserts such test points automatically. To enable test logic insertion we use the following command:

SETUP> set test logic -set on -reset on -clock on -tristate on

After the initial setup change mode to dft,

SETUP> set system mode dft

DFTAdvisor then checks for design rule violations and performs scannability checks. To run scan identification:

DFT> run

The scan identification statistics are displayed. To modify the design and insert scan cells and test points, type:

DFT> insert test logic -scan on -test_point on

After scan and test point insertion, write the output in EDIF format. VHDL or Verilog formats are useful for BIST insertion. DFTAdvisor also generates dofiles and test procedure files. The dofiles contain setup instructions for the ATPG tools (FastScan and Flextest). The test procedure files describe the order in which the control signals should be applied for normal and scan operation.

DFT> write netlist dfta_netlist.edif -edif -replace
The output is written to dfta_netlist.edif

DFT> write atpg setup dfta_netlist -replace
The setup files are written to dfta_netlist.dofile and dfta_netlist.testproc

Finally, exit the program:

DFT> exit

The above procedure is applied to all the individual modules in their respective directories.

After scan insertion, BIST insertion can be done. Since the BIST architect tool could not be used, we could only do the test vector generation on the scan inserted design using FastScan and Flextest.

Test Generation Using FastScan And Flextest

FastScan and FlexTest are Mentor Graphics ATPG tools. FastScan performs full-scan automatic test pattern generation (ATPG) for scan-based designs. FlexTest creates test patterns for full-scan, partial-scan or non-scan designs. Both contain embedded fault simulators.

The ATPG tools require a structural (gate-level) design netlist and a DFT library. Every element in the netlist must have an equivalent description in the ATPG library. After invocation, the tool goes into a setup mode. Within setup mode, tasks can be run with commands, a dofile or in GUI mode. After exiting setup mode the ATPG tool creates a flattened design model and performs learning analysis and design rule checking on the model.

To generate patterns we enter the ATPG mode (in this project we used the default fault list). Pattern generation was done for different fault types (stuck@, Iddq, toggle and transition) and with different pattern sources. Generally, if the test coverage is not high enough because of sequential elements, Flextest is used for ATPG. The Flextest algorithm differs from that of FastScan and can give higher coverage. Hence we use both and compare the results. The procedure for the integrated module is discussed here; the same procedure is applied to the individual modules.
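
To make the stuck@ fault model and the effect of the pattern set concrete, the Python sketch below fault-simulates a tiny example circuit. The circuit y = (a AND b) OR c, the node names and the function names are hypothetical and are not taken from the CPU design; the ATPG tools work on the full netlist and are far more sophisticated.

```python
from itertools import product

# Hypothetical example circuit: y = (a AND b) OR c, with internal node n1.
NODES = ("a", "b", "c", "n1", "y")


def simulate(pattern, fault=None):
    """Return output y for pattern (a, b, c); fault is (node, stuck_value) or None."""
    def observe(node, value):
        # A stuck@ fault forces the node to a fixed value regardless of its driver.
        if fault is not None and fault[0] == node:
            return fault[1]
        return value

    a = observe("a", pattern[0])
    b = observe("b", pattern[1])
    c = observe("c", pattern[2])
    n1 = observe("n1", a & b)
    return observe("y", n1 | c)


def stuck_at_coverage(patterns):
    """Percentage of single stuck@ faults detected by at least one pattern."""
    faults = [(node, value) for node in NODES for value in (0, 1)]
    detected = {
        fault
        for fault in faults
        for pattern in patterns
        if simulate(pattern) != simulate(pattern, fault)
    }
    return 100.0 * len(detected) / len(faults)


exhaustive = list(product((0, 1), repeat=3))
print(stuck_at_coverage(exhaustive))     # all eight input patterns
print(stuck_at_coverage([(1, 1, 0)]))    # a single pattern detects fewer faults
```

A fault counts as detected only when some pattern makes the faulty output differ from the good output, which is why a small or poorly chosen pattern set leaves faults undetected.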

FastScan

Invoking FastScan:

prompt>$MGC_HOME/bin/FastScan $WORKDIR/dfta_netlist.edif -edif -lib $WORKDIR/atglib_flex10

After invocation the program enters setup mode. At this point dofile generated by DFTAdvisor is read,

SETUP> set dofile abort on
SETUP> dofile dfta_netlist.dofile

Test Generation:

After setup, change the mode to ATPG; the program goes through testability checks before changing mode. Then specify the fault type. We considered the stuck@, Iddq, toggle and transition fault types for test generation with internal patterns (deterministic + random). The test generation was also repeated with random patterns only. The runs with only random patterns give a good indication of the test coverage that can be achieved with BIST.

SETUP> set system mode atpg
ATPG>

ATPG> set pattern source internal
ATPG> set fault type stuck
ATPG> add faults -all
ATPG> run
ATPG> report stat

Change the fault type to Iddq and run the test,

ATPG> set fault type iddq
ATPG> add faults -all
ATPG> run
ATPG> report stat

Change the fault type to Toggle and run the test,

ATPG> set fault type toggle
ATPG> add faults -all
ATPG> run
ATPG> report stat

Change the fault type to Transition and run the test

ATPG> set fault type transition
ATPG> add faults -all
ATPG> run
ATPG> report stat

Change the pattern generator source to random,

ATPG> set pattern source random

Repeat the test generation for each fault type.

This procedure is repeated for individual modules. The results are discussed in the results section.


Flextest

The procedure is very similar to FastScan and is briefly mentioned here,

Invoking FlexTest:

prompt>$MGC_HOME/bin/FlexTest $WORKDIR/dfta_netlist.edif -edif -lib $WORKDIR/atglib_flex10

After invocation the program enters setup mode and DFTAdvisor generated dofile is read,

SETUP> set dofile abort on
SETUP> dofile dfta_netlist.dofile

Invoke design rule checking,

SETUP> set system mode drc
DRC>

Invoke ATPG,

DRC> set system mode atpg
ATPG> set fault type stuck
ATPG> add faults -all
ATPG> run
ATPG> report stat

The procedure is repeated for different fault types. Since Flextest can read and test a non-scan inserted circuit, we found it interesting to compare the test generation statistics before and after scan insertion for some modules. This is reported in the results section.

Simulation Data

Mentor Graphics uses following definitions:
Fault coverage = percentage of all faults that the test pattern set tests, treating untestable faults the same as undetected faults.

                 #DT + (#PD * posdet_credit)
               = ---------------------------
                         #all faults
   
DT (Detected): The detected fault class includes all faults that the ATPG process identifies as detected. The detected fault class contains two subclasses:
    det_simulation(DS) - faults detected when the tool performs fault simulation.
    det_implication(DI)- faults detected when the tool performs learning analysis.

PD (Posdet): The posdet or possible-detected, fault classes include all faults that fault simulation identifies as possible-detected but not hard detected. A possible-detected fault results in a 0-X or 1-X difference at an observation point. By default, the calculations give 50% credit for posdet faults.
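
The fault coverage definition can be written as a small Python function. This is only a sketch of the arithmetic; the function name and the fault counts in the example call are hypothetical and are not taken from our runs.

```python
def fault_coverage(detected, possible_detected, total_faults, posdet_credit=0.5):
    """Fault coverage in percent: (#DT + #PD * posdet_credit) / #all faults.

    posdet_credit defaults to 0.5, i.e. 50% credit for possible-detected faults.
    """
    if total_faults <= 0:
        raise ValueError("total_faults must be positive")
    return 100.0 * (detected + possible_detected * posdet_credit) / total_faults


# Hypothetical counts: 900 detected, 40 possible-detected, 1000 faults in total.
print(fault_coverage(900, 40, 1000))
```

With the default credit, the 40 possible-detected faults in this example count as 20 detected faults.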

Fault Coverage Data

Fault coverage data was obtained on the overall CPU unit as well as three stand alone modules.

Overall CPU

This is the complete 16-bit CPU design excluding "memory array" (modeled as register array). The memory array is a 16x8 register array and as such the design does not lend itself to traditional test applications and requires significant modification of the original architecture. Hence this unit was turned off in our simulation.

Table-3: Overall CPU Fault Coverage.

            | FastScan                 | FastScan                 | Flextest
            | Internal Patterns        | Random Patterns          | Internal (w/Scan)
            | (Deterministic + Random) | (Only)                   | (Deterministic + Random)
Fault Model | Fault Cov.(%)  Vectors   | Fault Cov.(%)  Vectors   | Fault Cov.(%)  Test Cycles
------------+--------------------------+--------------------------+----------------------------
Stuck@      | 54             225       | 24             14        | 44             159
Iddq        | 81             129       | 78             79        | 88             190
Toggle      | 97             88        | 93             67        | 99             58
Transition  | 38             358       | 13             15        | 38*            400

*Simulation stopped after approximately 25-30 minutes.

Control Module

Following data is for Control module. It consists of FSM (Finite State Machine) of the CPU and random logic.

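Table-4: Control Module Fault Coverage.

            | FastScan                 | FastScan                 | Flextest                 | Flextest
            | Internal Patterns        | Random Patterns          | Internal (W/Scan)        | W/O Scan
            | (Deterministic + Random) | (Only)                   |                          |
Fault Model | Fault Cov.(%)  Vectors   | Fault Cov.(%)  Vectors   | Fault Cov.(%)  Cycles    | Fault Cov.(%)  Cycles
------------+--------------------------+--------------------------+--------------------------+----------------------
Stuck@      | 94             176       | 75             97        | 99.9           163       | 93             686
Iddq        | 68             81        | 67             60        | 89             59        | 89             466
Toggle      | 99             37        | 97             39        | 100            31        | 99             221
Transition  | 81             243       | 62             171       | 87             712       | 87             1125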

Trireg Module

Trireg module is a tri-statable register connected to main CPU bus.

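Table-5: Trireg Module Fault Coverage.

            | FastScan                 | FastScan                 | Flextest                 | Flextest
            | Internal Patterns        | Random Patterns          | Internal (W/Scan)        | W/O Scan
            | (Deterministic + Random) | (Only)                   |                          |
Fault Model | Fault Cov.(%)  Vectors   | Fault Cov.(%)  Vectors   | Fault Cov.(%)  Cycles    | Fault Cov.(%)  Cycles
------------+--------------------------+--------------------------+--------------------------+----------------------
Stuck@      | 79             16        | 71             10        | 79             14        | 74             31
Iddq        | 56             9         | 56             9         | 73             6         | 72             18
Toggle      | 84             5         | 84             4         | 84             6         | 82             15
Transition  | 59             17        | 59             17        | 63             46        | 57             89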

ALU Module

The ALU is a 16-bit arithmetic logic unit and consists entirely of combinational logic.

Table-6: ALU Module Fault Coverage.

            | FastScan                 | FastScan                 | Flextest
            | Internal Patterns        | Random Patterns          | Internal (w/Scan)
            | (Deterministic + Random) | (Only)                   | (Deterministic + Random)
Fault Model | Fault Cov.(%)  Vectors   | Fault Cov.(%)  Vectors   | Fault Cov.(%)  Test Cycles
------------+--------------------------+--------------------------+----------------------------
Stuck@      | 100            190       | 95             167       | 100            195
Iddq        | 100            76        | 98             55        | 100            77
Toggle      | 100            50        | 98             40        | 100            61
Transition  | 100            402       | 90             278       | 100            506

Results Analysis

Coverage Analysis

The data clearly indicate that as the sequential content of a module increases, fault coverage starts to decrease. The complexity of the module also affects fault coverage and the number of vectors. Trireg is the smallest module, so higher coverage was expected; however, its high sequential content and tri-state bus contention are suspected to be responsible for its low coverage.

We know that the CPU coverage was low because bus contention removed many of the eligible patterns.

Iddq fault coverage varies between modules. In this test the idea is to drive adjacent nodes in opposite directions so as to create a path from Vcc to ground; the resulting higher current consumption indicates the presence of defect(s). However, all modules except the ALU contain varying degrees of sequential logic, and Iddq testing requires the simulator to have very good control of nodes. As a result, the ALU is the only module that reaches the highest coverage.

Transition fault coverage suffers from same limitations as Iddq and also requires more vectors in general.

Toggle is not a test for defects but an indication of how much of the circuitry can be toggled. In industry it is used for screening out infant-mortality parts during the manufacturing burn-in process. Burn-in is a process that accelerates certain failure mechanisms and allows screening of parts that would fail during the very early stages of their life.
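
A simplified way to picture a toggle measurement is to count the nodes that take both logic values over the applied vectors. The Python sketch below does this under that simplifying assumption; the function name, node names and values are hypothetical, and the tools' actual toggle fault accounting may differ.

```python
def toggle_coverage(snapshots):
    """Percentage of nodes observed at both 0 and 1 across all snapshots.

    snapshots: list of dicts mapping node name -> 0/1, one dict per applied vector.
    """
    seen = {}
    for snapshot in snapshots:
        for node, value in snapshot.items():
            seen.setdefault(node, set()).add(value)
    if not seen:
        return 0.0
    toggled = [node for node, values in seen.items() if values == {0, 1}]
    return 100.0 * len(toggled) / len(seen)


# Hypothetical node values for three vectors; n3 never leaves 0.
vectors = [
    {"n1": 0, "n2": 1, "n3": 0, "n4": 1},
    {"n1": 1, "n2": 1, "n3": 0, "n4": 0},
    {"n1": 0, "n2": 0, "n3": 0, "n4": 1},
]
print(toggle_coverage(vectors))
```

Because a node only has to change value, not propagate a difference to an output, toggle numbers tend to be higher than the corresponding stuck@ numbers.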

The data for the random tests give a fairly good idea of the fault coverage that could be expected if BIST were inserted into the circuit.

Another observation was that sequential logic requires more deterministic patterns than random logic does, because testing sequential logic depends on past values.

Flextest vs FastScan Coverage Impact

An interesting observation was that for individual modules (and perhaps for the CPU as well, had bus contention not been present) Flextest has equal or better coverage than FastScan. However, more vectors are needed to achieve this coverage, perhaps due to the presence of sequential elements and the forward and back tracking they require.

Flextest can also reduce number of vectors based on amount of scan present.

Summary

The initial goals of the project were to insert full scan and BIST, to understand DFT issues and to learn the use of Mentor Graphics's DFT tools. However, one of the objectives (BIST) had to be reset due to BISTa tool issues and the absence of an ATPG cell library. We therefore devoted significant effort to generating an ATPG library for Altera Flex10 FPGA technology. We then compared fault coverage results on various modules of our project using both the FastScan and Flextest ATPG tools. Even though we had to disable the "memory array" (modeled as a register array in our design), we do not believe that this detracted from our goal: in practice, memory test methodology is completely different, and it was never our objective to pursue it. Future enhancements for the project will be to extend the ATPG library further and to insert BIST in the overall CPU design. This will involve either modeling the "memory array" as a RAM (see Mentor Graphics's "Design-For-Test Common Test Process Manual") or black-boxing it.

References

[Abr'90]    Miron Abramovici, Melvin A. Breuer and Arthur Friedman, Digital Systems Testing And Testable Design, IEEE Press, 1990.
[Bar'87]    Paul H. Bardell, William H. McAnney and Jacob Savir, Built-In Test for VLSI: Pseudorandom Techniques, John Wiley & Sons, 1987.
[Cro'99]    Alfred L. Crouch, Design For Test For Digital IC's And Embedded Core Systems, Prentice-Hall, 1999.
[Kho'78]    Zvi Kohavi, Switching And Finite Automata Theory, McGraw-Hill, 1978.
[Lan'98]    Christian Landrault (Translated by Prof. Marek Perkowski of PSU), Test And Design For Test, 1998.
[Per'98]    Douglas Perry, VHDL, McGraw-Hill, 1998.
[Ska'96]    Kevin Skahill, VHDL For Programmable Logic, Addison-Wesley, 1996.
 

Appendix-A: VHDL Code Listing For The CPU Modules.

cpulib.vhd

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
package cpu_lib is
  type t_shift is (shftpass, shl, shr, rotl, rotr);
  subtype t_alu is unsigned(3 downto 0);
  constant alupass : unsigned(3 downto 0) := "0000";
  constant andOp : unsigned(3 downto 0) := "0001";
  constant orOp : unsigned(3 downto 0) := "0010";
  constant notOp : unsigned(3 downto 0) := "0011";
  constant xorOp : unsigned(3 downto 0) := "0100";
  constant plus : unsigned(3 downto 0) := "0101";
  constant alusub : unsigned(3 downto 0) := "0110";
  constant inc : unsigned(3 downto 0) := "0111";
  constant dec : unsigned(3 downto 0) := "1000";
  constant zero : unsigned(3 downto 0) := "1001";
  type t_comp is (eq, neq, gt, gte, lt, lte);
  subtype t_reg is std_logic_vector(2 downto 0);
  type state is (reset1, reset2, reset3, reset4, reset5, reset6, execute,
                 nop, load, store, move, load2, load3, load4, store2, store3, 
                 store4, move2, move3, move4,incPc, 
                 incPc2, incPc3, incPc4, incPc5, incPc6, loadPc, loadPc2, 
                 loadPc3, loadPc4, bgtI2, bgtI3, bgtI4, bgtI5, bgtI6, bgtI7,
                 bgtI8, bgtI9, bgtI10, braI2, braI3, braI4, braI5, braI6, loadI2,
                 loadI3, loadI4, loadI5, loadI6, inc2, inc3, inc4);
  subtype bit16 is std_logic_vector(15 downto 0);
end cpu_lib;

cpu.vhd

library IEEE;
use IEEE.std_logic_1164.all;
use work.cpu_lib.all;
entity cpu is
  port(clock, reset, ready : in std_logic;
       addr : out bit16;
       rw, vma : out std_logic;
       data : inout bit16);
end cpu;
architecture rtl of cpu is
  component regarray
    port( data : in bit16;
          sel : in t_reg;
          en : in std_logic;
          clk : in std_logic;
          q : out bit16);
  end component;
  component reg
    port( a : in bit16;
          clk : in std_logic;
          q : out bit16);
  end component;
  component trireg
    port( a : in bit16;
          en : in std_logic;
          clk : in std_logic;
          q : out bit16);
  end component;
  component control
    port( clock : in std_logic;
          reset : in std_logic;
          instrReg : in bit16;
          compout : in std_logic;
          ready : in std_logic;
          progCntrWr : out std_logic;
          progCntrRd : out std_logic;
          addrRegWr : out std_logic;
          outRegWr : out std_logic;
          outRegRd : out std_logic;
          shiftSel : out t_shift;
          aluSel : out t_alu;
          compSel : out t_comp;
          opRegRd : out std_logic;
          opRegWr : out std_logic;
          instrWr : out std_logic;
          regSel : out t_reg;
          regRd : out std_logic;
          regWr : out std_logic;
          rw : out std_logic;
          vma : out std_logic
        );
  end component;
  component alu
    port( a, b : in bit16;
          sel : in t_alu;
          c : out bit16);
  end component;
  component shift
    port ( a : in bit16;
           sel : in t_shift;
           y : out bit16);
  end component;
  component comp
    port( a, b : in bit16;
          sel : in t_comp;
          compout : out std_logic);
  end component;
  signal  opdata, aluout, shiftout, instrregOut : bit16;
  signal regsel : t_reg;
  signal regRd, regWr, opregRd, opregWr, outregRd, outregWr, 
         addrregWr, instrregWr, progcntrRd, progcntrWr, compout : std_logic;
  signal alusel : t_alu;
  signal shiftsel : t_shift;
  signal compsel : t_comp;
begin
  ra1 : regarray port map(data, regsel, regRd, regWr, data);
  opreg: trireg port map (data, opregRd, opregWr, opdata);
  alu1: alu port map (data, opdata, alusel, aluout);
  shift1: shift port map (aluout, shiftsel, shiftout);
  outreg: trireg port map (shiftout, outregRd, outregWr, data);
  addrreg: reg port map (data, addrregWr, addr);
  progcntr: trireg port map (data, progcntrRd, progcntrWr, data);
  comp1: comp port map (opdata, data, compsel, compout);
  instr1: reg port map (data, instrregWr, instrregOut);
  con1: control port map (clock, reset, instrregOut, compout, 
        ready, progcntrWr, progcntrRd, addrregWr, outregWr, outregRd,
        shiftsel, alusel, compsel, opregRd, opregWr, instrregWr,
        regsel, regRd, regWr, rw, vma);
end rtl;

alu.vhd

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;
use work.cpu_lib.all;
entity alu is
  port( a, b : in bit16;
        sel : in t_alu;
        c : out bit16);
end alu;
architecture rtl of alu is
begin
  aluproc: process(a, b, sel)
  begin
    case sel is
      when alupass =>
        c <= a after 1 ns;
      when andOp =>
        c <= a and b after 1 ns;
      when orOp =>
        c <= a or b after 1 ns;
      when xorOp =>
        c <= a xor b after 1 ns;
      when notOp =>
        c <= not a after 1 ns;
      when plus =>
        c <= a + b after 1 ns;
      when alusub =>
        c <= a - b after 1 ns;
      when inc =>
        c <= a + "0000000000000001" after 1 ns;
      when dec =>
        c <= a - "0000000000000001" after 1 ns;
      when zero =>
        c <= "0000000000000000" after 1 ns;
      when others =>
        c <= "0000000000000000" after 1 ns;
    end case;
  end process;
end rtl;
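
The ALU listing above can be exercised with a small self-checking testbench. The following is a minimal sketch and not part of the original project sources: the entity name alu_tb and the stimulus values are illustrative assumptions, and it assumes that cpu_lib declares t_alu as the 4-bit unsigned subtype implied by the opcode constants.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use work.cpu_lib.all;

-- Hypothetical testbench (alu_tb is an assumed name, not from the project).
entity alu_tb is
end alu_tb;

architecture sim of alu_tb is
  component alu
    port( a, b : in bit16;
          sel : in t_alu;
          c : out bit16);
  end component;
  signal a, b, c : bit16;
  signal sel : t_alu;
begin
  dut: alu port map (a, b, sel, c);

  stim: process
  begin
    -- plus: 3 + 4 = 7
    a <= "0000000000000011";
    b <= "0000000000000100";
    sel <= plus;
    wait for 10 ns;   -- longer than the 1 ns delay modeled in the ALU
    assert c = "0000000000000111"
      report "plus: expected 7" severity error;

    -- inc: the 16-bit result wraps from all ones to zero
    a <= "1111111111111111";
    sel <= inc;
    wait for 10 ns;
    assert c = "0000000000000000"
      report "inc: expected wrap to 0" severity error;

    -- notOp: bitwise complement of a
    a <= "0000000011111111";
    sel <= notOp;
    wait for 10 ns;
    assert c = "1111111100000000"
      report "notOp: expected complement" severity error;

    wait;             -- stop the stimulus process
  end process;
end sim;
```

A run with no assertion messages indicates that the three operations behave as the listing suggests; the remaining opcodes can be checked in the same way.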

comp.vhd

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
use work.cpu_lib.all;
--use work.cpu_math.all;
entity comp is
  port( a, b : in bit16;
        sel : in t_comp;
        compout : out std_logic);
end comp;
architecture rtl of comp is
begin
  compproc: process(a, b, sel)
  begin
    case sel is
      when eq =>
        if a = b then 
          compout <= '1' after 1 ns;
        else
          compout <= '0' after 1 ns;
        end if;
      when neq =>
        if a /= b then 
          compout <= '1' after 1 ns;
        else
          compout <= '0' after 1 ns;
        end if;
      when gt =>
        if a > b then 
          compout <= '1' after 1 ns;
        else
          compout <= '0' after 1 ns;
        end if;
      when gte =>
        if a >= b then 
          compout <= '1' after 1 ns;
        else
          compout <= '0' after 1 ns;
        end if;
      when lt =>
        if a < b then 
          compout <= '1' after 1 ns;
        else
          compout <= '0' after 1 ns;
        end if;
      when lte =>
        if a <= b then 
          compout <= '1' after 1 ns;
        else
          compout <= '0' after 1 ns;
        end if;
    end case;
  end process;
end rtl;
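
Because comp.vhd applies the relational operators directly to std_logic_vector operands, and std_logic_arith does not overload them for that type, the comparison appears to fall back to the predefined left-to-right array ordering. For equal-length vectors of '0' and '1' this behaves like an unsigned comparison. The sketch below, which is not part of the original sources (comp_tb and the stimulus values are assumptions), checks that reading.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use work.cpu_lib.all;

-- Hypothetical testbench (comp_tb is an assumed name, not from the project).
entity comp_tb is
end comp_tb;

architecture sim of comp_tb is
  component comp
    port( a, b : in bit16;
          sel : in t_comp;
          compout : out std_logic);
  end component;
  signal a, b : bit16;
  signal sel : t_comp;
  signal compout : std_logic;
begin
  dut: comp port map (a, b, sel, compout);

  stim: process
  begin
    -- a has only its MSB set: 32768 if unsigned, negative if signed
    a <= "1000000000000000";
    b <= "0000000000000001";

    sel <= gt;
    wait for 10 ns;
    assert compout = '1'
      report "gt: expected unsigned ordering (a > b)" severity error;

    sel <= lte;
    wait for 10 ns;
    assert compout = '0'
      report "lte: expected '0' for a > b" severity error;

    sel <= neq;
    wait for 10 ns;
    assert compout = '1'
      report "neq: expected '1' for a /= b" severity error;

    wait;
  end process;
end sim;
```

If signed comparison were intended, the operands would need to be converted to a signed type first; nothing in the listing suggests that is the case.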

control.vhd

library IEEE;
use IEEE.std_logic_1164.all;
use work.cpu_lib.all;
entity control is
  port( clock : in std_logic;
           reset : in std_logic;
           instrReg : in bit16;
           compout : in std_logic;
           ready : in std_logic;
           progCntrWr : out std_logic;
           progCntrRd : out std_logic;
           addrRegWr : out std_logic;
           addrRegRd : out std_logic;  -- never driven in this architecture; absent from the control component in cpu.vhd
           outRegWr : out std_logic;
           outRegRd : out std_logic;
           shiftSel : out t_shift;
           aluSel : out t_alu;
           compSel : out t_comp;
           opRegRd : out std_logic;
           opRegWr : out std_logic;
           instrWr : out std_logic;
           regSel : out t_reg;
           regRd : out std_logic;
           regWr : out std_logic;
           rw : out std_logic;
           vma : out std_logic
        );
end control;
architecture rtl of control is
  signal current_state, next_state : state;
begin
  nxtstateproc: process( current_state, instrReg, compout, ready)
  begin
    progCntrWr <= '0';
    progCntrRd <= '0';
    addrRegWr <= '0';
    outRegWr <= '0';
    outRegRd <= '0';
    shiftSel <= shftpass;
    aluSel <= alupass;
    compSel <= eq;
    opRegRd <= '0';
    opRegWr <= '0';
    instrWr <= '0';
    regSel <= "000";
    regRd <= '0';
    regWr <= '0';
    rw <= '0';
    vma <= '0';
    case current_state is 
      when reset1 =>
        aluSel <= zero after 1 ns;
        shiftSel <= shftpass;
        next_state <= reset2;
      when reset2 =>
        aluSel <= zero;
        shiftSel <= shftpass;
        outRegWr <= '1';
        next_state <= reset3;
      when reset3 => 
        outRegRd <= '1';
        next_state <= reset4;
      when reset4 =>
        outRegRd <= '1';
        progCntrWr <= '1';
        addrRegWr <= '1';
        next_state <= reset5;
      when reset5 =>
        vma <= '1';
        rw <= '0';
        next_state <= reset6;
      when reset6 =>
        vma <= '1';
        rw <= '0';
        if ready = '1' then
          instrWr <= '1';
          next_state <= execute;
        else
          next_state <= reset6;
        end if;
      when execute =>
        case instrReg(15 downto 11) is
          when "00000" =>               -- nop
            next_state <= incPc;
          when "00001" =>               -- load
            regSel <= instrReg(5 downto 3);
            regRd <= '1';
            next_state <= load2;
          when "00010" =>               -- store
            regSel <= instrReg(2 downto 0);
            regRd <= '1';
            next_state <= store2;
          when "00011" =>               -- move
            regSel <= instrReg(5 downto 3);
            regRd <= '1';
            aluSel <= alupass;
            shiftSel <= shftpass;
            next_state <= move2;
          when "00100" =>               -- loadI
            progcntrRd <= '1';
            alusel <= inc;
            shiftsel <= shftpass;
            next_state <= loadI2;
          when "00101" =>               -- BranchImm
            progcntrRd <= '1';
            alusel <= inc;
            shiftsel <= shftpass;
            next_state <= braI2;
          when "00110" =>               -- BranchGTImm
            regSel <= instrReg(5 downto 3);
            regRd <= '1';
            next_state <= bgtI2;
          when "00111" =>               -- inc
            regSel <= instrReg(2 downto 0);
            regRd <= '1';
            alusel <= inc;
            shiftsel <= shftpass;
            next_state <= inc2;
          when others =>
            next_state <= incPc;
        end case;
      when load2 =>
        regSel <= instrReg(5 downto 3);
        regRd <= '1';
        addrregWr <= '1';
        next_state <= load3;
      when load3 =>
        vma <= '1';
        rw <= '0';
        next_state <= load4;
      when load4 =>
        vma <= '1';
        rw <= '0';
        regSel <= instrReg(2 downto 0);
        regWr <= '1';
        next_state <= incPc;
      when store2 =>
        regSel <= instrReg(2 downto 0);
        regRd <= '1';
        addrregWr <= '1';
        next_state <= store3;
      when store3 =>
        regSel <= instrReg(5 downto 3);
        regRd <= '1';
        next_state <= store4;
      when store4 =>
        regSel <= instrReg(5 downto 3);
        regRd <= '1';
        vma <= '1';
        rw <= '1';
        next_state <= incPc;
      when move2 =>
        regSel <= instrReg(5 downto 3);
        regRd <= '1';
        aluSel <= alupass;
        shiftsel <= shftpass;
        outRegWr <= '1';
        next_state <= move3;
      when move3 =>
        outRegRd <= '1';
        next_state <= move4;
      when move4 =>
        outRegRd <= '1';
        regSel <= instrReg(2 downto 0);
        regWr <= '1';
        next_state <= incPc;
      when loadI2 =>
        progcntrRd <= '1';
        alusel <= inc;
        shiftsel <= shftpass;
        outregWr <= '1';
        next_state <= loadI3;
      when loadI3 =>
        outregRd <= '1';
        next_state <= loadI4;
      when loadI4 =>
        outregRd <= '1';
        progcntrWr <= '1';
        addrregWr <= '1';
        next_state <= loadI5;
      when loadI5 =>
        vma <= '1';
        rw <= '0';
        next_state <= loadI6;
      when loadI6 =>
        vma <= '1';
        rw <= '0';
        if ready = '1' then 
          regSel <= instrReg(2 downto 0);
          regWr <= '1';
          next_state <= incPc;
        else
          next_state <= loadI6;
        end if;
      when braI2 =>
        progcntrRd <= '1';
        alusel <= inc;
        shiftsel <= shftpass;
        outregWr <= '1';
        next_state <= braI3;
      when braI3 =>
        outregRd <= '1';
        next_state <= braI4;
      when braI4 =>
        outregRd <= '1';
        progcntrWr <= '1';
        addrregWr <= '1';
        next_state <= braI5;
      when braI5 =>
        vma <= '1';
        rw <= '0';
        next_state <= braI6;
      when braI6 =>
        vma <= '1';
        rw <= '0';
        if ready = '1' then 
          progcntrWr <= '1';
          next_state <= loadPc;
        else
          next_state <= braI6;
        end if;
      when bgtI2 =>
        regSel <= instrReg(5 downto 3);
        regRd <= '1';
        opRegWr <= '1';
        next_state <= bgtI3;
      when bgtI3 =>
        opRegRd <= '1';
        regSel <= instrReg(2 downto 0);
        regRd <= '1';
        compsel <= gt;
        next_state <= bgtI4;
      when bgtI4 =>
        opRegRd <= '1' after 1 ns;
        regSel <= instrReg(2 downto 0);
        regRd <= '1';
        compsel <= gt;
        if compout = '1' then 
          next_state <= bgtI5;
        else 
          next_state <= incPc;
        end if;
      when bgtI5 =>
        progcntrRd <= '1';
        alusel <= inc;
        shiftSel <= shftpass;
        next_state <= bgtI6;
      when bgtI6 =>
        progcntrRd <= '1';
        alusel <= inc;
        shiftsel <= shftpass;
        outregWr <= '1';
        next_state <= bgtI7;
      when bgtI7 =>
        outregRd <= '1';
        next_state <= bgtI8;
      when bgtI8 =>
        outregRd <= '1';
        progcntrWr <= '1';
        addrregWr <= '1';
        next_state <= bgtI9;
      when bgtI9 =>
        vma <= '1';
        rw <= '0';
        next_state <= bgtI10;
      when bgtI10 =>
        vma <= '1';
        rw <= '0';
        if ready = '1' then 
          progcntrWr <= '1';
          next_state <= loadPc;
        else 
          next_state <= bgtI10;
        end if;
      when inc2 =>
        regSel <= instrReg(2 downto 0);
        regRd <= '1';
        alusel <= inc;
        shiftsel <= shftpass;
        outregWr <= '1';
        next_state <= inc3;
      when inc3 =>
        outregRd <= '1';
        next_state <= inc4;
      when inc4 =>
        outregRd <= '1';
        regsel <= instrReg(2 downto 0);
        regWr <= '1';
        next_state <= incPc;
      when loadPc =>
        progcntrRd <= '1';
        next_state <= loadPc2;
      when loadPc2 =>
        progcntrRd <= '1';
        addrRegWr <= '1';
        next_state <= loadPc3;
      when loadPc3 =>
        vma <= '1';
        rw <= '0';
        next_state <= loadPc4;
 
      when loadPc4 =>
        vma <= '1';
        rw <= '0';
        if ready = '1' then 
          instrWr <= '1';
          next_state <= execute;
        else
          next_state <= loadPc4;
        end if;
      when incPc =>
        progcntrRd <= '1';
        alusel <= inc;
        shiftsel <= shftpass;
        next_state <= incPc2;
      when incPc2 =>
        progcntrRd <= '1';
        alusel <= inc;
        shiftsel <= shftpass;
        outregWr <= '1';
        next_state <= incPc3;
      when incPc3 =>
        outregRd <= '1';
        next_state <= incPc4;
      when incPc4 =>
        outregRd <= '1';
        progcntrWr <= '1';
        addrregWr <= '1';
        next_state <= incPc5;
      when incPc5 =>
        vma <= '1';
        rw <= '0';
        next_state <= incPc6;
      when incPc6 =>
        vma <= '1';
        rw <= '0';
        if ready = '1' then 
          instrWr <= '1';
          next_state <= execute;
        else 
          next_state <= incPc6;
        end if;
      when others => 
        next_state <= incPc;
    end case;
  end process;
  controlffProc: process(clock, reset)
  begin
    if reset = '1' then 
      current_state <= reset1 after 1 ns;
    elsif clock'event and clock = '1' then 
      current_state <= next_state after 1 ns;
    end if;
  end process;
end rtl;
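
The reset sequence of the controller (reset1 through reset6, followed by execute once ready is asserted) can be checked in isolation. The testbench below is a minimal sketch under stated assumptions and is not part of the original sources: the name control_tb, the 20 ns clock period, and the sample times are all illustrative, and the component declaration simply mirrors the entity ports listed above.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use work.cpu_lib.all;

-- Hypothetical testbench (control_tb is an assumed name, not from the project).
entity control_tb is
end control_tb;

architecture sim of control_tb is
  component control
    port( clock, reset : in std_logic;
          instrReg : in bit16;
          compout, ready : in std_logic;
          progCntrWr, progCntrRd : out std_logic;
          addrRegWr, addrRegRd : out std_logic;
          outRegWr, outRegRd : out std_logic;
          shiftSel : out t_shift;
          aluSel : out t_alu;
          compSel : out t_comp;
          opRegRd, opRegWr, instrWr : out std_logic;
          regSel : out t_reg;
          regRd, regWr, rw, vma : out std_logic);
  end component;
  signal clock : std_logic := '0';
  signal reset : std_logic := '1';
  signal ready, compout : std_logic := '0';
  signal instrReg : bit16 := "0000000000000000";  -- opcode 00000 = nop
  signal vma, rw, instrWr : std_logic;
  signal done : boolean := false;
begin
  dut: control port map (
    clock => clock, reset => reset, instrReg => instrReg,
    compout => compout, ready => ready,
    progCntrWr => open, progCntrRd => open, addrRegWr => open,
    addrRegRd => open, outRegWr => open, outRegRd => open,
    shiftSel => open, aluSel => open, compSel => open,
    opRegRd => open, opRegWr => open, instrWr => instrWr,
    regSel => open, regRd => open, regWr => open,
    rw => rw, vma => vma);

  -- 20 ns clock with rising edges at 10, 30, 50, ... ns
  clkgen: process
  begin
    while not done loop
      clock <= '0'; wait for 10 ns;
      clock <= '1'; wait for 10 ns;
    end loop;
    wait;
  end process;

  stim: process
  begin
    wait for 15 ns;
    reset <= '0';            -- FSM starts in reset1
    wait for 85 ns;          -- t = 100 ns: four edges later, reset5
    assert vma = '1' and rw = '0'
      report "reset5: expected vma=1, rw=0" severity error;
    wait for 20 ns;          -- t = 120 ns: reset6, waiting for ready
    assert instrWr = '0'
      report "reset6: instrWr should wait for ready" severity error;
    ready <= '1';
    wait for 5 ns;           -- combinational outputs settle
    assert instrWr = '1'
      report "reset6: expected instrWr=1 when ready" severity error;
    wait for 15 ns;          -- t = 140 ns: edge at 130 ns enters execute
    assert vma = '0'
      report "execute: expected vma=0" severity error;
    done <= true;            -- let the clock process finish
    wait;
  end process;
end sim;
```

The same pattern extends to the instruction sequences: drive instrReg with an opcode from the execute case and sample the strobes one clock period at a time.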

reg.vhd

library IEEE;
use IEEE.std_logic_1164.all;
use work.cpu_lib.all;
entity reg is
  port( a : in bit16;
        clk : in std_logic;
        q : out bit16);
end reg;
architecture rtl of reg is 
begin
  regproc: process
  begin
    wait until clk'event and clk = '1';
    q <= a after 1 ns;
  end process;
end rtl;

regarray.vhd

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;
use work.cpu_lib.all;
entity regarray is
  port( data : in bit16;
        sel : in t_reg;
        en : in std_logic;
        clk : in std_logic;
        q : out bit16);
end regarray;
architecture rtl of regarray is 
  type t_ram is array (0 to 7) of bit16;
  signal temp_data : bit16;
begin
  process(clk,sel)
    variable ramdata : t_ram;
  begin
    if clk'event and clk = '1' then
      ramdata(conv_integer(sel)) := data;
    end if;
    temp_data <= ramdata(conv_integer(sel)) after 1 ns;
  end process;
  process(en, temp_data) 
  begin
    if en = '1' then 
      q <= temp_data after 1 ns;
    else
      q <= "ZZZZZZZZZZZZZZZZ" after 1 ns;
    end if;
  end process;
end rtl;

shift.vhd

library IEEE;
use IEEE.std_logic_1164.all;
use work.cpu_lib.all;
entity shift is
  port ( a : in bit16;
         sel : in t_shift;
         y : out bit16);
end shift;
architecture rtl of shift is
begin
  shftproc: process(a, sel)
  begin
    case sel is
      when shftpass =>
        y <= a after 1 ns;
      when shl =>
        y <= a(14 downto 0) & '0' after 1 ns;
      when shr =>
        y <= '0' & a(15 downto 1) after 1 ns;
      when rotl =>
        y <= a(14 downto 0) & a(15) after 1 ns;
      when rotr =>
        y <= a(0) & a(15 downto 1) after 1 ns;
    end case;
  end process;
end rtl;
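
The four shift and rotate modes above can be confirmed with a single operand that has both its MSB and LSB set. This is a hedged sketch rather than project code: shift_tb and the stimulus pattern are assumptions, and t_shift is taken to be the enumeration whose literals appear in the case statement.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use work.cpu_lib.all;

-- Hypothetical testbench (shift_tb is an assumed name, not from the project).
entity shift_tb is
end shift_tb;

architecture sim of shift_tb is
  component shift
    port ( a : in bit16;
           sel : in t_shift;
           y : out bit16);
  end component;
  signal a, y : bit16;
  signal sel : t_shift;
begin
  dut: shift port map (a, sel, y);

  stim: process
  begin
    a <= "1000000000000001";

    sel <= shl;    -- MSB is dropped, '0' enters at the LSB
    wait for 10 ns;
    assert y = "0000000000000010" report "shl mismatch" severity error;

    sel <= shr;    -- LSB is dropped, '0' enters at the MSB
    wait for 10 ns;
    assert y = "0100000000000000" report "shr mismatch" severity error;

    sel <= rotl;   -- MSB wraps around to the LSB
    wait for 10 ns;
    assert y = "0000000000000011" report "rotl mismatch" severity error;

    sel <= rotr;   -- LSB wraps around to the MSB
    wait for 10 ns;
    assert y = "1100000000000000" report "rotr mismatch" severity error;

    wait;
  end process;
end sim;
```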

trireg.vhd

library IEEE;
use IEEE.std_logic_1164.all;
use work.cpu_lib.all;

entity trireg is
  port( a : in bit16;
        en : in std_logic;
        clk : in std_logic;
        q : out bit16);
end trireg;
architecture rtl of trireg is 
  signal val : bit16;
begin
  triregdata: process
  begin
    wait until clk'event and clk = '1';
    val <= a;
  end process;
  trireg3st: process(en, val)
  begin
    if en = '1' then
      q <= val after 1 ns;
    elsif en = '0' then 
      q <= "ZZZZZZZZZZZZZZZZ" after 1 ns;
-- exemplar_translate_off
    else
      q <= "XXXXXXXXXXXXXXXX" after 1 ns;
-- exemplar_translate_on
    end if;
  end process;
end rtl;
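
The tri-state register drives the shared data bus only while en is '1' and releases it to high impedance otherwise. The sketch below illustrates that behavior; it is not part of the original sources, and the name trireg_tb, the data value, and the timing are assumptions chosen for illustration.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use work.cpu_lib.all;

-- Hypothetical testbench (trireg_tb is an assumed name, not from the project).
entity trireg_tb is
end trireg_tb;

architecture sim of trireg_tb is
  component trireg
    port( a : in bit16;
          en : in std_logic;
          clk : in std_logic;
          q : out bit16);
  end component;
  signal a, q : bit16;
  signal en : std_logic := '0';
  signal clk : std_logic := '0';
begin
  dut: trireg port map (a, en, clk, q);

  stim: process
  begin
    a <= "0000000000101010";
    wait for 5 ns;
    clk <= '1';              -- rising edge captures a into the register
    wait for 5 ns;
    clk <= '0';
    assert q = "ZZZZZZZZZZZZZZZZ"
      report "output should float while en = '0'" severity error;

    en <= '1';               -- enable the output driver
    wait for 5 ns;
    assert q = "0000000000101010"
      report "output should show the stored value" severity error;

    en <= '0';               -- release the bus again
    wait for 5 ns;
    assert q = "ZZZZZZZZZZZZZZZZ"
      report "output should float after en returns to '0'" severity error;

    wait;
  end process;
end sim;
```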

Appendix-B: ATPG Library Listing

// Developers: Ajay Ojha ( ajay.ojha@intel.com ), Nagesh Venkataramaiah ( nagesh@ims.com )
// Sub: Altera Flex10 ATPG Library
// Date: Nov. 1999
// $Log$
//

model false (out) (
output(out) (
fault = none; primitive =_tie0(out);)
)

model true (out) (
output (out) (
fault = none; primitive = _tie1(out);
)
)

model inv (in, out) (
cell_type = inv;
input (in) ()
output (out) (primitive = _inv(in, out);)
)

model or2 (in[0], in[1], out) (
cell_type = or;
input (in[0], in[1]) ()
output (out) (primitive = _or(in[0], in[1], out);)
)

model or3 (IN1, IN2, IN3, Y) (
input (IN1, IN2, IN3) ()
intern (_NN1) (primitive = _or(IN2, IN3, _NN1);)
output (Y) (primitive = _or(IN1, _NN1, Y);)
)

model or4 (IN1, IN2, IN3, IN4, Y) (
input (IN1, IN2, IN3, IN4) ()
intern (_NN1) (primitive = _or(IN3, IN4, _NN1);)
intern (_NN2) (primitive = _or(IN2, _NN1, _NN2);)
output (Y) (primitive = _or(IN1, _NN2, Y);)
)

model and2 (in[0], in[1], out) (
cell_type = and;
input (in[0], in[1]) ()
output (out) (primitive = _and(in[0], in[1], out);)
)

model and3 (IN1, IN2, IN3, Y) (
input (IN1, IN2, IN3) ()
intern (_NN1) (primitive = _and(IN2, IN3, _NN1);)
output (Y) (primitive = _and(IN1, _NN1, Y);)
)

model and4 (IN1, IN2, IN3, IN4, Y) (
input (IN1, IN2, IN3, IN4) ()
intern (_NN1) (primitive = _and(IN1, IN2, _NN1);)
intern (_NN2) (primitive = _and(_NN1, IN3, _NN2);)
output (Y) (primitive = _and(_NN2, IN4, Y);)
)

model cascade (IN1, Y) (
input (IN1) ()
output (Y) (function = IN1 ;)
)

alias soft cascade
alias lcell cascade

model inbuf (IN, OUT) (
input (IN) ()
output (OUT) (primitive = _buf(IN, OUT);)
)

alias outbuf inbuf

model vcc (Y) (
output (Y) (
fault = none; primitive = _tie1(Y);
)
)

model gnd (Y) (
output (Y) (
fault = none; primitive = _tie0(Y);
)
)

model tri (in, enable, out) (
input(in) ()
input(enable) ()
output(out) (primitive=_tsh(in , enable, out);)
)

model dffrs (set, reset, in, clk, out ) (
input(set) ()
input(reset) ()
input(in) ()
input(clk) ()
output(out) (primitive=_dff(set, reset, clk, in, out,);)
)

model sdffrs (set, reset, in, clk, SI, SE, out ) (
scan_definition (
type = mux_scan;
data_in = in;
scan_in = SI;
scan_enable = SE;
scan_out = out;
non_scan_model = dffrs(set, reset, in, clk, out);
)
input(set) ()
input(reset) ()
input(SE) ()
input(in, SI) ()
input(clk) ()
intern (_ND) (primitive = _mux(in, SI, SE, _ND);)
output(out) (primitive=_dff(set, reset, clk, _ND, out,);)
)

model dffers (set, reset, in, clk, ce, out ) (
input(set) ()
input(reset) ()
input(in) ()
input(clk) ()
input(ce) ()
intern (_NN1) (primitive = _mux(in, out, ce, _NN1);)
output (out) (primitive = _dff(set, reset, clk, _NN1, out,);)
)

model sdffers (set, reset, in, clk, SI, SE, ce, out ) (
scan_definition (
type = mux_scan;
data_in = in;
scan_in = SI;
scan_enable = SE;
scan_out = out;
non_scan_model = dffers(set, reset, in, clk, ce, out);
)
input(set) ()
input(reset) ()
input(in, SI) ()
input(clk) ()
input(SE) ()
input(ce) ()
intern (_ND) (primitive = _mux(in, SI, SE, _ND);)
intern (_NN2) (primitive = _or(ce, SE, _NN2);)
intern (_NN1) (primitive = _mux(_ND, out, _NN2, _NN1);)
output (out) (primitive = _dff(set, reset, clk, _NN1, out,);)
)

model mux2(A, B, S, O) (
cell_type = mux S A B;
input(A, B, S) ()
output(O) (primitive = _mux(A, B, S, O);)
)