Tuesday, June 29, 2010

The future of computing? The SpiNNaker million processor computer

The future of computing is a very big claim. However, if anything is clear, it is that the future of computing does not lie in a common time, unified shared memory system, which pretty well describes every multicore computer made at the moment. The trouble is that once you go to more than a few thousand cores the shared memory, shared time concept falls to bits. It just doesn't scale. Where then should we look for a new model other than biology? The brains of living creatures are capable of phenomenal processing power and yet have almost none of the features of the computers we build today. Don't you find this odd?

Let's look at how brains do it:

  • Does each neuron have a sense of the passage of time? Not in the traditional sense and maybe not at all. If it does, it will probably be only a vague ordering.

  • Does a neuron share a memory space with every other neuron? No. The closest concept is an area effect of diffusion of chemicals which affect neighbours (which may not be synaptically connected) but that is very far from a shared memory concept.

  • Is each neuron connected to every other neuron? They have many connections to close neurons and some connections to neurons further away, best described by a statistical distribution, but otherwise, again, no.

So clearly modern computing bears as much relation to a brain as sudoku does to quantum mechanics. The closest we get are the datacenters of a company such as Google but they enjoy far too much connectivity to be a good model.

Enter the world of academia because no commercial company would be stupid enough to commit the resources we can into a project like this (and all the ones before it which make it possible) - it is unlike anything that has come before and it is risky (i.e. it may not work!).

Project SpiNNaker

This revolutionary idea is nothing less than a plan to put together a computer of approximately 1 million cores with no common clock or shared memory and which can route messages with a model that approximates a neurological system.

Born in the University of Manchester's Advanced Processor Technologies Group, the SpiNNaker project has taken shape in collaboration with the University of Southampton's School of Electronics and Computer Science (where I work), the Engineering and Physical Sciences Research Council and two enlightened commercial companies: ARM (for the processor IP) and Silístix (for their Network on Chip expertise).

SpiNNaker is a child of many parents, each a vital step on the path to its genesis:
  1. AsipIDE GALS Design and Co-Simulation Framework - A hardware/software co-design and debugging framework
  2. Transactional Memory - A new locking strategy which massively simplifies the ability to lock data to pass information between processes
  3. TERAFLUX: Exploiting Dataflow Parallelism in Teradevice Computing - A pan-European project looking at computers with massive numbers of cores
  4. The Balsa Asynchronous Synthesis System - A language and compiler targeting asynchronous, handshake driven logic design

The main centrepiece of the SpiNNaker project is a special System on Chip, the SpiNNaker SoC. Inside are 18 processors asynchronously connected via a blindingly fast network-on-chip and communicating with the other SpiNNaker SoCs via hundreds of megabit links. With these SoCs it will take only 56,000 chips to reach the target of 1 million cores (18 x 56,000 = 1,008,000). The cores are ARM968 series processors capable of significant independent computation while communicating with their neighbours or the 1Gbit DDR SDRAM available to each SoC. Even within a SoC the cores do not share a common clock and communicate by passing messages which are routed to each other or outside to another SpiNNaker SoC.

A seriously ambitious project you say? Absolutely right. Vapourware or a dream? Hell no! Feast your eyes on this:




This is the first generation of SpiNNaker SoCs on a test board - there are four dual processor SoCs on this board, each next to its accompanying 1Gbit RAM chip. This board already holds 8 processors asynchronously interconnected with all the necessary debug hardware to perfect the design of the next generation.

Who ever said British Science was dull?

BIG DISCLAIMER: I am not one of the great minds trying to change the world with this project. At best I have helped a couple of people around the edges. I am a very big fan, however!

Thursday, April 22, 2010

A fixed OpenSPARC T2 build for Design Compiler 2009....

Just a very quick note to say that we have fixed the compile problems for the OpenSPARC T1 processor when building it with modern versions of Design Compiler (any > 2007).

More information, including sizes and speeds on the Synopsys 90nm EDK to follow. Bear in mind it is missing the PLL and a couple of other small modules.

We will, of course, push this upstream.

Wednesday, March 31, 2010

Current activity with the EVE ZeBu Hardware/Software Co-Verification Environment

The title is a fancy way of referring to the EVE ZeBu accelerators (much more information on their website). I previously posted on our acquisition of a UF-2 (I should mention we liked it so much we now have 2!) so I am taking a moment to show what we are up to with this wonderful technology. I have broken it into research and teaching topics.

Research
Behavioural Simulation and Synthesis of Biological Neuron Systems using VHDL
The investigation of neuron structures is an incredibly difficult and complex task that yields relatively low rewards in terms of information from biological forms (either animals or tissue). The structures and connectivity of even the simplest invertebrates are almost impossible to establish with standard laboratory techniques, and even when this is possible it is generally time consuming, complex and expensive. Recent work has shown how a simplified behavioural approach to modelling neurons can allow “virtual” experiments to be carried out that map the behaviour of a simulated structure onto a hypothetical biological one, with correlation of behaviour rather than underlying connectivity. The problems with such approaches are numerous. The first is the difficulty of simulating realistic aggregates efficiently, the second is making sense of the results and finally, it would be helpful to have an implementation that could be synthesised to hardware for acceleration. In this paper we present a VHDL implementation of Neuron models that allow large aggregates to be simulated. The models are demonstrated using a synthesizable system level VHDL model of the C. Elegans locomotory system.
The role of the EVE in this specific project is to verify and execute the largest of the neural net models using its co-simulation support, replacing the previous, limited technology based around an FPGA board driven by a probe program.
Publications
Bailey, J., Wilson, P., Brown, A. and Chad, J. (2008) Behavioural Simulation and Synthesis of Biological Neuron Systems using VHDL. In: BMAS. (In Press)
Bailey, J., Wilson, P. R., Brown, A. D. and Chad, J. (2007) Behavioural Simulation of Biological Neuron Systems using VHDL and VHDL-AMS. In: IEEE Behavioural Modeling and Simulation, Sep 2007, San Jose, USA. pp. 153-158.

Architectures for Numerical Computation

Since the 1960s, the observation that has become known as Moore’s Law has become a self-fulfilling prophecy. Processing power doubles every two years because of the advances in CMOS technology. There are clear signs, however, that these technological advances are coming to an end. The economics of pushing CMOS technology to its physical limits will eventually halt further development.
If it is no longer feasible to increase computing power through smaller, faster transistors, the alternative is massive parallelism. This progression is already apparent. Multi-core and multi-threaded processors are now common. Although modern operating systems are able to use multiple cores, with few exceptions, programs are confined to single cores. The challenge facing software engineers is to make best use of multiple cores.
A significant amount of processing power is concerned with numerical computation. Consumer applications, such as image and audio processing are fundamentally numerical. Similarly, engineering applications, such as simulation and optimization rely on numerical calculations. At this point, we should distinguish between consumer and desktop applications and High-Performance Computing (HPC) tasks that rely on clusters of dedicated processors. It is not our intention to move into the HPC world at this time.
While using multiple cores can accelerate many numerical algorithms, far greater speed-up would be possible using more specialized forms of hardware, such as GPUs and FPGAs. A further consideration is power consumption (and the related problem of heat dissipation). Custom hardware can reduce power consumption by an order of magnitude or more. The key, of course, is to use the resources in the best possible way. In the context of the work proposed here, there are two aspects to this problem. First, we need to make the best division between hardware and software and second, we need to design an appropriate overall architecture.
The obvious role of the EVE platform in this research field is to support the research into specific computing pipelines, fine grain computation blocks and architectures as well as enabling the development of some 16 lane PCIe computation accelerators.

Teaching
Hosting a Complex SoC on the EVE Platform
This project was a proof of concept and de-risking of the EVE transactor flow using a large SoC. The SoC in question was chosen to be the Gaisler-Aeroflex LEON3 (http://www.gaisler.com). The LEON3 SoC is based around a SPARC v8 compatible CPU and is written in VHDL. The minimal LEON3 SoC was built using the EVE support for memory to model the processor cache along with the transactors for DRAM and UARTs. This work will be extended to include the VGA, Ethernet and USB transactors on the hardware side and to include support for the Snapgear Linux version for the LEON3.

Verification of a highly integrated ASIC
A large master's level project framework will produce a heterogeneous multicore ASIC to perform processing on HD video data streams. It will bundle a 32 bit microcontroller core, on chip SRAM and our custom geometric processor along with a multilayer AMBA bus architecture optimised for power and contention. The EVE will be an invaluable support to the simulation and verification of the final design before it is sent for manufacture.

So there you have it - quite a lot going on, all of which is really fascinating and fun!

Wednesday, March 17, 2010

EWME2010 and Synopsys' EDK

Firstly EWME 2010 is coming in May. It is the European Workshop on Microelectronic Education which along with its sister conference in the USA, Microelectronic Systems Education (MSE), is pretty much unique in the world.

If you need help or advice on starting a programme at your institution then this is a great opportunity to meet a very cohesive, dedicated group of educators.

EWME and MSE are characterised by fair and honest discussions on the challenges and tools available to educators in this sphere, along with a showcase of our best student projects and modules. Papers detail how to run the course administratively as well as the practical aspects. We also get a few vendors who are always pleased to get you into contact with the appropriate University Programmes.

Secondly the Synopsys 90nm EDK and our experiences with it. We have been users of this EDK for about a year now and it forms a great basis for working with our students, partly because the established notes and curriculum resources make it very quick to get off the ground with several complex areas. Compared to the IDESA Advanced Digital Physical Implementation Flow course, which promotes the TSMC 90nm design kit, this one is far more student friendly with a lower startup cost (it also doesn't require attendance to get the notes and access to the labs). [However for research the IDESA course is a fantastic introduction to 90nm and below.]

See you at EWME 2010 in Darmstadt!

Wednesday, January 27, 2010

IBM PowerPC™ 405-S: Verification of the Gate Level Netlist with the supplied Artisan SRAMs

[Please note that these instructions reference the 1.00a version of the Synopsys PowerPC™ 405-S coreKit which is the version provided in the University IP package from IBM. There are updated versions available from Synopsys.]

This is the fourth article in this series (click here for the series index) describing the IBM PowerPC™ 405 synthesizable core and peripherals and will cover functionally and formally verifying the synthesized gate-level structural Verilog netlist.

Firstly let us look at the two main classes of verification:
  1. Formal Verification
  2. Functional Verification
In a sentence, functional verification tests if a circuit executes one or more testcase(s) correctly while formal verification attempts to show that two circuits are mathematically equivalent (for a certain value of equivalent).
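
To make the distinction concrete, here is a toy Python sketch. It is not part of the coreKit flow, and the two "circuits" are hypothetical majority-vote functions invented for illustration. The functional check runs a handful of chosen test vectors, while the equivalence check compares the two functions on every input combination. Real equivalence checkers such as Formality do not work by enumerating inputs; exhaustive comparison is simply an assumption that is workable for a three-input example.

```python
from itertools import product

def reference(a, b, c):
    """Hypothetical 'RTL': majority vote written as a sum."""
    return int(a + b + c >= 2)

def implementation(a, b, c):
    """Hypothetical 'netlist': the same function built from AND/OR gates."""
    return (a & b) | (b & c) | (a & c)

def functional_check(circuit, testcases):
    """Functional verification: run chosen vectors and compare with expected outputs."""
    return all(circuit(*inputs) == expected for inputs, expected in testcases)

def equivalence_check(ref, impl, n_inputs):
    """Toy stand-in for formal verification: compare on every input combination."""
    return all(ref(*bits) == impl(*bits) for bits in product((0, 1), repeat=n_inputs))

testcases = [((0, 0, 0), 0), ((1, 1, 0), 1), ((1, 1, 1), 1)]
functional_ok = functional_check(implementation, testcases)
equivalent = equivalence_check(reference, implementation, 3)
print(functional_ok, equivalent)
```

A broken circuit that ignores input c (for example one that only computes a AND b) would still pass these three testcases yet fail the equivalence check, which is why the two kinds of verification complement each other.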

Luckily for us the coreKit provides scripts and support for automating both activities. The completed netlist (with scan chains, more on their importance later) can be put into the same simulation framework that verifies the RTL is fully functional. The completed netlist can also be formally verified against the source RTL by using Synopsys Formality™.

Formal Verification

The relevant part of the instruction manual (docs/405_ivug.pdf) is:

Chapter 9: Gate-Level Verification


So for the following process assume you have used this series of articles to synthesize a PowerPC™ 405-S core with scan chains using the Artisan PDK and the files in the list below are in their correct locations.

You will need the following:
  1. A synthesized gate level netlist in structural Verilog (typically ./dc/netlist/PPC405F5V1_dft.v)
  2. Your technology library in Synopsys DB form (if you are following the supplied Artisan flow with SRAMs it is ./tech_lib/artisan_13lvfsg/syn/slow.db)
  3. Synopsys Formality™, Synopsys' premier formal equivalence checker
Command:
From the ./fm directory in the current workspace:

./run_fm.gate ../src/ ../tech_lib/artisan_13lvfsg/syn/slow.db ../dc/netlist/PPC405F5V1_dft.v $SYNOPSYS

Output:
Assuming you entered the command correctly Formality™ should start and the process will begin by synthesising the original RTL. This will produce a large number of warnings but in this case this is normal and expected.

It then sets a few constants disabling the scan mode in the design:

set_constant ref:/WORK/$MODULE/TESTC405SCANENABLE -type port 0
Set 'ref:/WORK/PPC405F5V1/TESTC405SCANENABLE' to constant 0
1
set_constant impl:/WORK/$MODULE/TESTC405SCANENABLE -type port 0
Set 'impl:/WORK/PPC405F5V1/TESTC405SCANENABLE' to constant 0
1

It now checks the designs and issues warnings for the RAM models.
[A “black box” is the term for a section or module of the circuit that is not visible to the tool so it is warning that it cannot verify the contents.]

Status:  Checking designs...
    Warning: Design ref:/DATARAM_64X34/dataram_64X34 is a black box and there are cells referencing it (FM-160)
    Warning: Design ref:/SRAM256X46/sram256x46 is a black box and there are cells referencing it (FM-160)
    Warning: Design ref:/SRAMBYTWR512X128/sramBytWr512x128 is a black box and there are cells referencing it (FM-160)
    Warning: Design ref:/SRAM512X8/sram512x8 is a black box and there are cells referencing it (FM-160)

And finally compares the design at the matching points, producing this worrying looking output:

Status:  Verifying...
....
    Compare point C405TESTSCANOUT0 failed (is not equivalent)
    Compare point C405TESTSCANOUT1 failed (is not equivalent)
    Compare point C405TESTSCANOUT2 failed (is not equivalent)
    Compare point C405TESTSCANOUT3 failed (is not equivalent)
    Compare point C405TESTSCANOUT4 failed (is not equivalent)
    Compare point C405TESTSCANOUT5 failed (is not equivalent)
    Compare point C405TESTSCANOUT6 failed (is not equivalent)
.
    Compare point C405TESTSCANOUT7 failed (is not equivalent)
.

********************************* Verification Results *********************************
Verification FAILED
   ATTENTION: RTL interpretation messages were produced during link
              of reference design.
              Verification results may disagree with a logic simulator.
   ATTENTION: 8 failing compare points have unmatched undriven signals in their
                reference fan-in.  To report such failing points, use
                "report_failing_points -inputs unmatched -inputs undriven".
              8 such failing compare points are directly undriven primary output ports.
              To report directly undriven failing primary output ports, use
                "report_failing_points -point_type directly_undriven_output".
              To suppress verification of directly undriven primary output ports, use
                "set_dont_verify_point -directly_undriven_output".
              To read about undriven signal handling, use
                "man verification_set_undriven_signals".
----------------------------------------------------------------------------------------
 Reference design: ref:/WORK/PPC405F5V1
 Implementation design: impl:/WORK/PPC405F5V1
 16662 Passing compare points
 8 Failing compare points
 0 Aborted compare points
 0 Unverified compare points
----------------------------------------------------------------------------------------
Matched Compare Points     BBPin    Loop   BBNet     Cut    Port     DFF     LAT   TOTAL
----------------------------------------------------------------------------------------
Passing (equivalent)        1204       0       0       0     578   14679     201   16662
Failing (not equivalent)       0       0       0       0       8       0       0       8
Not Compared
  Clock-gate LAT                                                               1       1
****************************************************************************************
Info:  Formality Guide Files (SVF) can improve verification success by automating setup.
0 
 
However, don't be downhearted! All Formality™ has done is its job and spotted a difference: the scan chains you inserted during the compile are not present in the source RTL. So ignoring the scan chains, as far as the tool can tell, this is the design you wanted the synthesis tool to manufacture!
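
If you want to confirm this without reading the transcript by eye, a small script can do it. The following is a minimal sketch in Python, assuming the transcript has been saved to a text file and that the failure lines look exactly like the ones quoted above. The function names and the sample text are illustrative and are not part of Formality or the coreKit.

```python
import re

# Assumption: failure lines match the format quoted in the transcript above.
FAIL_RE = re.compile(r"Compare point (\S+) failed \(is not equivalent\)")

def failing_points(transcript):
    """Return the names of all failing compare points, in order."""
    return FAIL_RE.findall(transcript)

def failures_are_scan_only(transcript, prefix="C405TESTSCANOUT"):
    """True if every failing compare point is one of the scan-out ports."""
    return all(name.startswith(prefix) for name in failing_points(transcript))

sample = """\
    Compare point C405TESTSCANOUT0 failed (is not equivalent)
    Compare point C405TESTSCANOUT1 failed (is not equivalent)
.
    Compare point C405TESTSCANOUT7 failed (is not equivalent)
"""
print(failing_points(sample))
print(failures_are_scan_only(sample))
```

Any failing compare point that is not a scan-out port makes the second function return False, which would be the cue to investigate further.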

Functional Verification

This is described in the same section of the manual as Formal Verification. The prerequisite is a fully functioning RTL verification environment - if it didn't work on the RTL it isn't going to on the netlist.

The relevant files are:
  • sim/testbench/p405s_test_top.v - this is the Verilog file containing the top level testbench. It contains a very useful parameter: parameter simulation_cycle = 100; near the top of the file which sets the whole simulation speed in terms of the clock applied to the CPU (not the PLB). It is also sensitive to a parameter called GATE_SIM. This changes the behavior of the testbench to accommodate the gate-level netlist
  • sim/scripts/run405.config - this file contains the parameters of the simulation "run" about to take place and needs editing to support the gate-level netlist simulation
Modifications to scripts/run405.config

This is the key script which needs a couple of modifications to simulate the netlist and not the RTL. It is a list of parameters organised into blocks which are read by the Perl script scripts/runTB. scripts/runTB then builds and executes the correct command line to run a particular simulation in your chosen simulator.

Open the script scripts/run405.config in your favourite text editor and look for the second block:

searchpath
./../src/rtl
./../src/mem_models
$SYNOPSYS/packages/gtech/src_ver
$SYNOPSYS/dw/sim_ver
./testbench
./vera/ver_shell
end

As you can see a 'block' has a header identifying it and is closed by an end keyword. The modification to this block is to comment out the ./../src/rtl by adding a # in front of the line like so:

searchpath
#./../src/rtl
./../src/mem_models
$SYNOPSYS/packages/gtech/src_ver
$SYNOPSYS/dw/sim_ver
./testbench
./vera/ver_shell
end

By doing this we exclude the unsynthesised RTL from being discovered by the simulator which is what we want. The netlist should contain the complete circuit.

Take a look at the third block:

differentfile
./../src/rtl/p405s_params.v
end

In this block we need to explicitly tell the simulator to compile the simulation models for the Artisan PDK cells from the technology library, the synthesised netlist itself and a wrapper called PPC405F5V1_soft.v:

differentfile
./../src/rtl/p405s_params.v
./../src/rtl/PPC405F5V1_soft.v
./../dc/netlist/PPC405F5V1_dft.v
./../tech_lib/artisan_13lvfsg/verilog/tsmc13_hs_modified_new.v
end

And finally we need to add the GATE_SIM variable already mentioned above. This makes the testbench behave slightly differently. When simulating the RTL the testbench forces the state of certain internal registers to 0 at the start of every run. With a gate-level netlist those registers may no longer exist, and certainly may not exist with the same name or hierarchy. To achieve the same effect when the GATE_SIM variable is defined, at the start of each run the testbench places the chip into scan mode and clocks 0s through the scan chains, forcing all flops to a starting value of 0.
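
The idea of flushing a scan chain with zeros can be illustrated with a toy model. This is a sketch in Python and not the testbench code: the chain length of 16 and the random power-up state are assumptions, and the chains in the real core are not modelled.

```python
import random

def scan_shift(chain, scan_in):
    """One clock edge in scan mode: each flop takes the value of the flop before it."""
    return [scan_in] + chain[:-1]

def flush_with_zeros(chain):
    """Clock 0s through the chain, one shift per flop, until every flop holds 0."""
    for _ in range(len(chain)):
        chain = scan_shift(chain, 0)
    return chain

# Unknown power-up state, modelled here as random bits (illustrative only).
random.seed(1)
flops = [random.randint(0, 1) for _ in range(16)]
flops = flush_with_zeros(flops)
print(flops)
```

After as many shifts as there are flops, every flop holds a known 0 regardless of its name or position in the hierarchy, which is the property the testbench relies on.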

The correct place to set GATE_SIM depends on your simulator so have a look at the blocks in the bottom of this file:

# This is a string appended to the MTI invocations
mtioptions
+define+UNITSIM +nospecify +define+GATE_SIM
end

# This is a string appended to the NC invocations
ncoptions
+define+UNITSIM +define+SYN_RTL +nospecify
end

These two example blocks (there are others) refer to Mentor Graphics' ModelSim (mtioptions) and Cadence's NC Verilog (ncoptions). To add the define, change the option line to read as follows (shown here for NC Verilog):

# This is a string appended to the NC invocations
ncoptions
+define+UNITSIM +define+SYN_RTL +nospecify +define+GATE_SIM
end
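
If you would rather script the three edits than make them by hand, they can be sketched as follows. This is a minimal Python sketch, assuming the block layout shown in the listings above (a header line, one entry per line, a closing end). The function name edit_config is hypothetical, and the real scripts/runTB parser may accept more than this simple reader does, so check the result before running a simulation.

```python
def edit_config(text, gate_files, options_block="ncoptions"):
    """Apply the three gate-level edits to run405.config-style text."""
    out, block = [], None
    for line in text.splitlines():
        word = line.strip()
        if block is None:
            # Outside any block: a non-empty, non-comment line opens one.
            if word and not word.startswith("#"):
                block = word
        elif word == "end":
            if block == "differentfile":
                # Add the wrapper, netlist and cell models before 'end'.
                out.extend([f for f in gate_files if f not in out])
            block = None
        elif block == "searchpath" and word == "./../src/rtl":
            line = "#" + line  # hide the unsynthesised RTL
        elif block == options_block and word and "+define+GATE_SIM" not in word.split():
            line = line.rstrip() + " +define+GATE_SIM"  # enable the scan reset
        out.append(line)
    return "\n".join(out) + "\n"

sample = """\
searchpath
./../src/rtl
./../src/mem_models
end

differentfile
./../src/rtl/p405s_params.v
end

# This is a string appended to the NC invocations
ncoptions
+define+UNITSIM +define+SYN_RTL +nospecify
end
"""
gate_files = [
    "./../src/rtl/PPC405F5V1_soft.v",
    "./../dc/netlist/PPC405F5V1_dft.v",
    "./../tech_lib/artisan_13lvfsg/verilog/tsmc13_hs_modified_new.v",
]
edited = edit_config(sample, gate_files)
print(edited)
```

The function is written so that running it twice leaves the text unchanged, which makes it safe to re-run on a file that has already been edited.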

You should now be ready to fire up coreConsultant and begin.

[Note: You can run the tests outside of coreConsultant but it won't generate the nice, navigable HTML reports. To do this use the command runtest from the sim directory.]

Changes in coreConsultant

1) Start coreConsultant by running the command coreConsultant in a terminal window.

2) Open the workspace you created to run the RTL functional verification from the previous article and you should see the window below:

Don't press 'Apply' until the last step at the bottom or it will start the simulation immediately and there are other settings to change!

3) Ensure the settings are correct in the 'Simulator' selection before moving on to 'Select Test Options'

4) In 'Select Test Options' choose Run_Gate_Sims. Move onto 'Select Testsuite'.

5) In 'Select Testsuite' set the path to the Netlist and the simulation library:


6) You are all done. Click 'Apply' to start the simulations

The results of the running simulations are visible in the test.log file which can be monitored in realtime by executing the following command in a terminal window from the sim directory:

tail -f test.log

Studying this file you should see the following (for NC Verilog but each simulator will issue warnings of this kind):

During the compilation you will get a lot of complaints about unconnected nets. This is only sensible as the individual cells from the cell library have a fixed number of outputs, especially the flops, that cannot be removed. They come with the cell:
TLATNX8HS LOCKUP1 ( .D(dp_scReg_icuStatus1_regL2[5]), .GN(sc), .Q(scan_out2)
                  |
ncelab: *W,CUVWSP (../dc/netlist/PPC405F5V1_dft.v,123147|18): 1 output port was not connected:
ncelab: (../tech_lib/artisan_13lvfsg/verilog/tsmc13_hs_modified_new.v,28144): QN

After the compilation things should settle down into a format similar to this, one per test:

ncsim: 08.10-s006: (c) Copyright 1995-2008 Cadence Design Systems, Inc.
Loading snapshot worklib.p405s_test_top:v .................... Done
=== Verilog with Synopsys Vera ===
ncsim> source /home/esdcad/software/cadence/linux/ius81/tools/inca/files/ncsimrc
ncsim> run
 ++---------------------------------------------------------------------++
 ||                     VERA System Verifier (TM)                       ||
 ||        Version: A-2007.12 () -- Wed Dec  9 17:15:27 2009            ||
 ||             Copyright (c) 1995-2004 by Synopsys, Inc.               ||
 ||                      All Rights Reserved                            ||
 ||                                                                     ||
 ||      For support, send email to vera-support@synopsys.com           ||
 ||                                                                     ||
 ||  This software and the associated documentation are confidential    ||
 ||  and proprietary to Synopsys Inc.  Your use or disclosure of this   ||
 ||  software is subject to the terms and conditions of a written       ||
 ||  license agreement between you, or your company, and Synopsys, Inc. ||
 ++---------------------------------------------------------------------++
Vera: Loading main "p405s_top_vera" (path = "p405s_test_top.top_vera")
Vera: Loading ../../vera/lib/p405s_top.vro..
Vera: Loading ../../vera/lib/p405s_memory.vro..
Vera: Loading ../../vera/lib/p405s_slave.vro..
Vera: Loading main "p405s_dcrmon" (path = "p405s_test_top.dcrmon_model")
Vera: Loading ../../vera/lib/p405s_dcrmon.vro..
Vera: Loading main "p405s_dcr" (path = "p405s_test_top.dcr_model")
Vera: Loading ../../vera/lib/p405s_dcr.vro..
Vera: Loading main "p405s_isocm" (path = "p405s_test_top.isocm_model")
Vera: Loading ../../vera/lib/p405s_isocm.vro..
Vera: Loading main "p405s_dsocm" (path = "p405s_test_top.dsocm_model")
Vera: Loading ../../vera/lib/p405s_dsocm.vro..
Vera: Loading main "p405s_intr_ctrl" (path = "p405s_test_top.intr_controller")
Vera: Loading ../../vera/lib/p405s_intr_ctrl.vro..
Vera: Loading main "p405s_jtag" (path = "p405s_test_top.jtag_controller")
Vera: Loading ../../vera/lib/p405s_jtag.vro..
Vera: Loading main "p405s_monitor" (path = "p405s_test_top.monitor_slave_side")
Vera: Loading ../../vera/lib/p405s_monitor.vro..
>> Setting TEST_c405ScanEnable = 1 <<
Entering the PLB Initialization sequence time=0
Entering Reset delay sequence time=1000
Completing Reset delay sequence time=2999000
>> Setting TEST_c405ScanEnable = 0 <<
>> Forcing Reset of D-CACHE Tags <<
3002000 info: Slave0: is in Reset

3003000 info: Slave0: is in Reset

3004000 info: Slave0: is in Reset

3005000 info: Slave0: is in Reset

3006000 info: Slave0: is in Reset

3007000 info: Slave0: is in Reset

3008000 info: Slave0: is in Reset

3009000 info: Slave0: is in Reset

3010000 info: Slave0: is in Reset

3011000 info: Slave0: is in Reset

3012000 info: Slave0: is in Reset

3013000 info: Slave0: is in Reset

>> Forcing Reset of D-CACHE Tags <<
DCRMON (3797000): DCR Write : Addr = 44 WData = bfff0000
DCRMON (3907000): DCR Write : Addr = 0 WData = 0
DCRMON (3922000): DCR Write : Addr = 11 WData = 0
DCRMON (3977000): DCR Write : Addr = 8 WData = 0
DCRMON (3997000): DCR Write : Addr = 1 WData = 0
DCRMON (4001000): DCR Write : Addr = 9 WData = 0
DCRMON (4005000): DCR Write : Addr = a WData = 0
DCRMON (4014000): DCR Write : Addr = 20 WData = 0
DCRMON (4028000): DCR Write : Addr = 3 WData = 0
DCRMON (4109000): DCR Write : Addr = b WData = 0
DCRMON (4113000): DCR Write : Addr = c WData = 0
6488000 info: Slave0:  MasterId=1  Request for Write addr=00000010

6488000 info: Slave0: 1 to MultiByte Write

TESTCASE took 3474 Clocks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TESTCASE "addfvt" PASSED
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Vera: finish encountered at time 6493000 cycle    6494
        total       mismatch: 0
                   vca_error: 0
              fail(expected): 0
                       drive: 10739
                      expect: 0
                      sample: 49891
                        sync: 16493
Vera: finish encountered at time 6494000 cycle    6495
        total       mismatch: 0
                   vca_error: 0
              fail(expected): 0
                       drive: 6493
                      expect: 0
                      sample: 3199836
                        sync: 607518
Vera: finish encountered at time 6494000 cycle    6495
        total       mismatch: 0
                   vca_error: 0
              fail(expected): 0
                       drive: 19494
                      expect: 0
                      sample: 64898
                        sync: 32462
Vera: finish encountered at time 6494000 cycle    6495
        total       mismatch: 0
                   vca_error: 0
              fail(expected): 0
                       drive: 6517
                      expect: 0
                      sample: 32469
                        sync: 19482
Vera: finish encountered at time 6494000 cycle    6495
        total       mismatch: 0
                   vca_error: 0
              fail(expected): 0
                       drive: 3882
                      expect: 0
                      sample: 56947
                        sync: 12987
Vera: finish encountered at time 6494000 cycle    6495
        total       mismatch: 0
                   vca_error: 0
              fail(expected): 0
                       drive: 11903
                      expect: 0
                      sample: 43549
                        sync: 12987
Vera: finish encountered at time 6494000 cycle    6495
        total       mismatch: 0
                   vca_error: 0
              fail(expected): 0
                       drive: 6737
                      expect: 0
                      sample: 33265
                        sync: 13564
Vera: finish encountered at time 6494000 cycle    6495
        total       mismatch: 0
                   vca_error: 0
              fail(expected): 0
                       drive: 1
                      expect: 0
                      sample: 140
                        sync: 108
TESTCASE "addfvt" PASSED
You can see the Vera banner at the top, followed by the individual Vera modules loading. The only difference from the RTL simulations is the following two lines:
>> Setting TEST_c405ScanEnable = 1 <<
and
>> Setting TEST_c405ScanEnable = 0 <<

which are found after the last Vera module loads at the start of the simulation. This is the effect of the GATE_SIM define added earlier and is showing the activation of the scan chains to zero the flops internally.
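
With many tests in one log it can help to summarise the results automatically. Below is a minimal Python sketch, assuming the log lines look like the excerpt above. Only the PASSED form appears in that excerpt, so the pattern captures whatever status word follows the test name instead of guessing at other spellings. The function name summarise is illustrative.

```python
import re

# Patterns taken from the log excerpt above.
RESULT_RE = re.compile(r'^TESTCASE "([^"]+)" (\w+)', re.MULTILINE)
SCAN_RE = re.compile(r">> Setting TEST_c405ScanEnable = ([01]) <<")

def summarise(log_text):
    """Return ({testcase: status}, [scan-enable values in order of appearance])."""
    results = dict(RESULT_RE.findall(log_text))
    scan_values = [int(v) for v in SCAN_RE.findall(log_text)]
    return results, scan_values

sample = '''\
>> Setting TEST_c405ScanEnable = 1 <<
Entering the PLB Initialization sequence time=0
>> Setting TEST_c405ScanEnable = 0 <<
TESTCASE took 3474 Clocks
TESTCASE "addfvt" PASSED
'''
results, scan_values = summarise(sample)
print(results)
print(scan_values)
```

For a real run, pass the contents of test.log to summarise. In a gate-level run the scan-enable values should alternate 1 then 0 for each test; if they are missing, the GATE_SIM define has probably not reached the simulator.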

And finally you should be greeted with a report in coreConsultant that looks like this:


There, you have now formally and functionally verified your very own PowerPC™ 405-S netlist. All it needs is to be placed and routed (and finally functionally and formally verified again) and you are good to go!

Published with permission from IBM and Synopsys

Friday, January 15, 2010

A word from a Lecturer on the use of Wikipedia, to students everywhere

The use of Wikipedia as a reference is a depressingly common crime committed by students but it is spreading to professional areas as well.

Wikipedia is an excellent introduction to a topic, but never a reference source. I often look up things on Wikipedia but I never use it as a reference or for something important without confirming it by some real, named, published and reviewed document.

The best summation of the structural issues of Wikipedia comes from the Sheldon cartoon by Dave Kellett (with the author's permission, http://www.sheldoncomics.com/archive/071213.html):



(... and outside academia, God help you if your boss has found you have bet the company or a product on data gathered from Wikipedia that you haven't confirmed via other research.)

Real life works on named sources that are reliable, not ephemera.

Friday, December 4, 2009

IBM PowerPC™ 405-S: Synthesis of the RTL with the supplied Artisan Cell Library

[Please note that these instructions reference the 1.00a version of the Synopsys PowerPC™ 405-S coreKit which is the version provided in the University IP package from IBM. There are updated versions available from Synopsys that address several of the issues outlined below. These are specifically commented on where known.]

This is the third article in this series (click here for the series index) describing the IBM PowerPC™ 405 synthesizable core and peripherals and will cover taking the IP through RTL synthesis and scan chain insertion with the modern versions of the Synopsys tools using the supplied Artisan Library. Integrating a custom library will be addressed in a future article.

This article assumes you have used the previous articles to unpack and install the software to create a functioning simulation environment.

Getting Started: Understanding the Problem

The relevant part of the instruction manual (docs/405_ivug.pdf) is:

Chapter 6: RTL-to-Gates Synthesis

In this chapter we are going to assume that the default Core clock of 250MHz and PLB clock of 125MHz are acceptable, along with the 130nm Artisan library designed for TSMC processes.
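
For reference when reading the clock constraints, these frequencies correspond to periods of 4 ns and 8 ns. A throwaway helper (period_ps is a made-up name) shows the arithmetic:

```shell
#!/bin/bash
# period_ps MHZ: clock period in picoseconds for a frequency given in MHz.
# 1 / (f MHz) = (1,000,000 / f) ps; integer arithmetic is enough here.
period_ps() {
    echo $(( 1000000 / $1 ))
}

echo "core clock: $(period_ps 250) ps"   # 250 MHz
echo "PLB clock:  $(period_ps 125) ps"   # 125 MHz
```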

Synthesis Scripts
Entering the dc subdirectory of the workspace shows the files in place; however, a little more digging reveals that the dc/scripts directory is missing. Using the same technique as in the previous article we can create and populate it.
[Note, in a similar vein to the last article all of the changes will be summarised in a script and instructions at the end]

cd dc

mkdir scripts
cd scripts
ln -s ../../../DW_ppc405-FS_iv_1.0a_AV/dc/scripts/* ./
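
Because the links use a relative path, it is worth confirming that none of them dangle before going further. The sketch below is one way to do that in bash; list_dangling_links is an illustrative name and the demo runs in a scratch directory rather than the real workspace.

```shell
#!/bin/bash
# list_dangling_links DIR: print any symlink in DIR whose target does not exist.
list_dangling_links() {
    local dir="$1" f
    for f in "$dir"/*; do
        # -L: is a symlink; ! -e: the target it points at is missing
        if [ -L "$f" ] && [ ! -e "$f" ]; then
            echo "$f"
        fi
    done
}

# Demo in a scratch directory: one good link, one broken link.
scratch=$(mktemp -d)
touch "$scratch/real.tcl"
ln -s "$scratch/real.tcl" "$scratch/good.tcl"
ln -s "$scratch/missing.tcl" "$scratch/broken.tcl"
list_dangling_links "$scratch"      # prints only the path ending in broken.tcl
rm -rf "$scratch"
```

In the workspace the equivalent call would be list_dangling_links . from inside dc/scripts; no output means every link resolves.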

The script system automates the flow from RTL, Constraints and Cell Library to a gate level netlist:
  • run_compile.csh - as the "master" script it sets environment variables and then executes the desired tool, sourcing the setup.tcl script and the appropriate stage script where the stages are "elaboration", "synthesis" and "design for test"
  • scripts/p405s_elab.tcl - reads the environment variables set and executes Design Compiler to elaborate the design and save it in a .db file
  • scripts/p405s_compile.tcl - reads the environment variables set and executes Design Compiler on the elaborated design to synthesize the design and save the resulting netlist, reports and other useful outputs
  • scripts/p405s_dft.tcl - reads the environment variables set and executes Design Compiler on the synthesized design to stitch the scan chains together and enable design for test strategies
Constraints
This scripting system relies on reading design constraints held in another directory:

dc/constraints

A design without appropriate constraints is basically useless as RTL is insufficient to create a functional circuit. The constraints for this processor are complex due to the presence of 3 clock domains (core, peripheral bus and JTAG) and two test modes. The constraint files contain the following information:
  • PPC405F5V1_capture.cons - This is responsible for creating environment variables containing data on the various clock speeds and parameters. It also identifies the clock ports in the design. It also specifies some of the different clock domains.
  • PPC405F5V1_clock.cons - This file uses the variables calculated in the constraint file PPC405F5V1_capture.cons to apply constraints to the clocks in the design.
  • PPC405F5V1_compile_directives.cons - Instructs Design Compiler to fix multiple port nets and to optimise a group of paths at a time for slack, as opposed to one path at a time. It also enables a check on clock gating.
  • PPC405F5V1_drc.cons - This file constrains the fanouts of the nets and sets attributes on some of the clock control logic so it will pass DRC.
  • PPC405F5V1_functional.cons - Creates false paths and multicycle paths through the RAM BIST and Debug sections of the design to allow efficient synthesis.
  • PPC405F5V1_group_paths.cons – Only two constraints are added here grouping all the inputs and all the outputs together for efficient synthesis.
  • PPC405F5V1_in_out.cons – This applies driving gates to all inputs and loads to all outputs, again to improve the synthesis and increase the accuracy of the reporting Design Compiler is able to achieve. Importantly the characteristics of the clocks are further constrained in this file. Finally more false paths are assigned to the BIST structures, JTAG and test ports.
  • PPC405F5V1_shift.cons – This sets the period on the various clock lines of the design when the design is in one of its test modes.
  • PPC405F5V1_test.cons - This contains the complex constraints which drive Test Compiler when it inserts scan chains and test modes automatically. It includes defining the scan in/out ports and the state of the various pins in test mode. This presents a particular issue which will be addressed later as the syntax and capabilities of Design Compiler have changed very significantly.
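
Since the stage scripts read all of these files, a missing one only shows up as a failure part-way through a run. A small pre-flight check can be sketched as follows; the function name missing_constraints is an assumption for illustration, while the file names are the nine listed above.

```shell
#!/bin/bash
# The nine constraint files described above.
CONSTRAINT_FILES="capture clock compile_directives drc functional group_paths in_out shift test"

# missing_constraints DIR: print the name of each expected file absent from DIR.
missing_constraints() {
    local dir="$1" name
    for name in $CONSTRAINT_FILES; do
        [ -f "$dir/PPC405F5V1_${name}.cons" ] || echo "PPC405F5V1_${name}.cons"
    done
}

# Demo: a scratch directory holding only two of the nine files.
scratch=$(mktemp -d)
touch "$scratch/PPC405F5V1_clock.cons" "$scratch/PPC405F5V1_test.cons"
missing_constraints "$scratch"      # lists the seven files that are absent
rm -rf "$scratch"
```

Run against dc/constraints in a correctly installed workspace it should print nothing.
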
Getting Started: Planning the Solution

The first priority for us is that Design Compiler no longer supports saving the design in .db form, so we need to repurpose the scripts to store the design in a more modern database. Design Compiler has also changed its clock gating and operand isolation commands, and its compilation, to accommodate modern 90nm and below CMOS. Considering that a major rebuild is necessary, I am going to show you how to synthesize the design in preparation for the use of Milkyway databases and the Design Compiler topographical feature. This will enable the porting of this design to 90nm CMOS and below in the future.

In the current system each tool is set up to use files from the Artisan library in the synthesis flow by a single Tcl script, which detects which tool is running it and then configures variables appropriately. This script is called at the start of each specific tool script. There is a lot of useful help in the current scripts, as the series of improvements to Design Compiler relies on physical information from the cell library to improve its decisions. The major improvement for sub 90nm CMOS is that the logic synthesis tool performs a quick preliminary cell placement so it can better model the delay, capacitance and power costs of the design during synthesis (topographical mode).

In the following sections the updates required to each individual script are discussed.

Primary Stage Scripts

The main change to all scripts was to save and load the design using the more modern .ddc database format.
[A normal Synopsys flow would store the design in the Milkyway database alongside physical data from the Cell Library but this version of the IP was not designed for this.]

Control Script: run_compile.csh

The master control script parses flags given to it on the command line and then executes Design Compiler in the tcl shell mode running the appropriate stage script. The only change to this script is:

Added a flag -topo to run Design Compiler in topographical mode which will enable the use of Milkyway databases in Design Compiler.

[That being said if you are going to use the -ultra flag or the -topo flag you might want to edit the line dc_shell-t -topo -f ./scripts/p405s_compile.tcl | tee logs/p405s_compile.log to include the correct -num_cpus X where X matches the number of processors and licenses you have for faster compiles. This isn't trivial to add as an option to the .csh script without introducing a dependency on the getopts binary so I have left it undone.]
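
The effect of the new flag on each stage can be sketched as a small dry-run helper. This is an illustration in bash, not the csh code from the patch; stage_command is a made-up name and it only prints the command line rather than running Design Compiler.

```shell
#!/bin/bash
# stage_command STAGE TOPO: print (not run) the Design Compiler command line
# for one stage. STAGE is elab, compile or dft; TOPO is true or false.
stage_command() {
    local stage="$1" topo="$2" flags=""
    case "$stage" in
        elab|compile|dft) ;;
        *) echo "unknown stage: $stage" >&2; return 1 ;;
    esac
    [ "$topo" = "true" ] && flags=" -topo"
    echo "dc_shell-t${flags} -f ./scripts/p405s_${stage}.tcl | tee logs/p405s_${stage}.log"
}

stage_command elab false
stage_command compile true
```

The second call prints the same line quoted in the note above, which is the one to edit by hand if extra dc_shell options are wanted.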

Setup: scripts/setup.tcl

The setup script is run at the start of each and every stage. It pulls in the libraries and sets up a few global variables.

The main change needed to bring this script into line with modern tools is to include information that would normally be used by Physical Compiler, so that Design Compiler can support its topographical mode when the -topo flag is specified. This is necessary to manage the shift of importance away from gates (in 130nm and above) to wire lengths (in 90nm and below), which begin to dominate the performance of a design. The physical information is used to perform a rough placement, allowing wire loads to be estimated much more accurately. It has also allowed the introduction of novel features such as congestion aware routing, which attempts to address routing issues at the netlist level.

This information is already in there as it was used by the static timing analysis and power estimation tools, but it needs to be enabled when the new topographical flag is asserted.

Elaboration: scripts/p405s_elab.tcl

Elaboration consists of reading in HDL files and constructing a representation of the design using components from the cell libraries, i.e. parsing the source code into a generic technology (GTECH) gate-level netlist which is then saved.

The main improvements to the p405s_elab.tcl script are:
  • Removal of obsolete commands
  • Conversion of the HDL commands from acs type to analyze types
  • Rewriting of the scan chain insertion section to replace the obsolete set_scan_signal commands with set_dft_signal, and moving them into the p405s_compile.tcl script
Compilation: scripts/p405s_compile.tcl

This stage reads in the gate netlist created in the previous elaboration phase and maps it to the specific gates from the chosen cell library, applies constraints and synthesizes an actual working gate level netlist using the target technology. The main changes are:
  • Replacement of the -ultra flag with the compile_ultra command
  • Force the use of the compile_ultra command when in topographical mode
  • Modification of the compile command to insert the scan flops
  • Removal of nested if's by building a compile string which is evaluated
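
The decision logic behind that compile string can be sketched as follows. This is bash written purely for illustration (the real script is Tcl, and build_compile_cmd is a made-up name); it prints the command the switches would select rather than running anything.

```shell
#!/bin/bash
# build_compile_cmd POWER SCAN ULTRA TOPO HIGH
# Each argument is "true" or "false"; prints the compile command that the
# switches select. Illustrative bash only, the real script is Tcl.
build_compile_cmd() {
    local power="$1" scan="$2" ultra="$3" topo="$4" high="$5" opts=""
    [ "$power" = "true" ] && opts="$opts -gate_clock"
    [ "$scan" = "true" ] && opts="$opts -scan"
    if [ "$ultra" = "true" ] || [ "$topo" = "true" ]; then
        echo "compile_ultra$opts"
    elif [ "$high" = "true" ]; then
        echo "compile$opts -map_effort high -boundary_optimization"
    else
        echo "compile$opts -map_effort medium -boundary_optimization"
    fi
}

build_compile_cmd true true false true false    # topographical run
build_compile_cmd false false false false true  # plain high-effort run
```

Building the option list once and choosing the command afterwards is what removes the need for one nested branch per combination of switches.
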
Design for Test: scripts/p405s_dft.tcl

In this phase the scan chains are built from the scan flops inserted into the design in the previous script. The design is then incrementally recompiled to minimize the scan logic and the appropriate reports and the test strategies are generated. Note that while few changes were needed to this script the scan constraints and options were completely replaced.
  • The script now preserves the use of the -high flag from synthesis to increase the effort of the incremental recompile
Constraints


There were only two constraint files that needed updating but one needed a complete rebuild to support the new syntax of Test Compiler.

constraints/PPC405F5V1_functional.cons

This file requires a slight modification to cope with a change in Design Compiler's internal bus naming style from signal[0] to signal<0>.
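
The rename itself is mechanical and can be illustrated with a sed filter. Treat this strictly as a sketch: the patch in this article edits only the affected lines by hand, a blanket rewrite of a whole constraint file may touch indices that should stay as they are, and both the helper name to_angle_brackets and the shortened path core/x in the second example are invented for the demonstration.

```shell
#!/bin/bash
# to_angle_brackets: rewrite bus indices such as NAME[16] or NAME[$x] as
# NAME<16> / NAME<$x>. Only an index that directly follows an identifier
# character is touched, so Tcl command brackets like [get_pins ...] are left alone.
to_angle_brackets() {
    sed -E 's/([A-Za-z0-9_])\[([0-9]+|\$[A-Za-z_][A-Za-z0-9_]*)\]/\1<\2>/g'
}

echo 'core/cacheMMUSch/DCU_data_NEG[16]' | to_angle_brackets
echo 'set_false_path -through [get_pins core/x/DCU_data_NEG[$x]]' | to_angle_brackets
```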


constraints/PPC405F5V1_test.cons

This file required a complete rebuild to manage the syntax changes in the new version of Test Compiler. The major changes are:
  • Replacement of create_test_clock, set_scan_signal, set_signal_type and set_test_hold with appropriate set_dft_signal
    declarations
  • Replacement of set_dft_optimization_configuration with set_dft_insertion_configuration
  • Reworking of set_scan_configuration flags
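
When porting a constraint file of your own, a quick search for the replaced commands shows what still needs converting. The sketch below assumes nothing beyond grep; find_obsolete is an illustrative name and the demo file is a stand-in, not the real PPC405F5V1_test.cons.

```shell
#!/bin/bash
# Commands named above as replaced in the test constraints.
OBSOLETE_CMDS='create_test_clock|set_scan_signal|set_signal_type|set_test_hold|set_dft_optimization_configuration'

# find_obsolete FILE: print "line:text" for every uncommented use of an
# obsolete command; prints nothing if the file is clean.
find_obsolete() {
    grep -nE "^[[:space:]]*(${OBSOLETE_CMDS})([[:space:]]|\$)" "$1"
}

# Demo on a three-line stand-in file.
demo=$(mktemp)
cat > "$demo" <<'EOF'
set_test_hold 1 TESTC405TESTMODE
#set_scan_signal test_scan_in -port TESTC405SCANIN0
set_dft_signal -view spec -type ScanDataIn -port [get_port TESTC405SCANIN0]
EOF
find_obsolete "$demo"       # reports line 1 only
rm -f "$demo"
```

Commented-out uses are ignored on purpose, since the original file keeps some old commands as comments.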

Synthesis: Solved

There is one script to link the needed files and one unified patch file which will update all the relevant tool scripts and constraint files to work correctly.

1) Open your favorite text editor and paste in the following script:

#!/bin/bash
#
# fix_dc_dir.sh
#
#Copyright Matthew Swabey (matthew@swabey.org).
#
#You are licensed to use this file in conjunction with the IBM PowerPC 405 coreKit and to make modifications as needed to this script as long as this copyright notice is kept.

#This script should be executed from the new workspace after you have installed it with coreConsultant.

UNPACKED_COREKIT_DIR=/export/fastcad/disktmp/mas/IBM_PUB3/DW_ppc405-FS_iv_1.0a_AV

#Link the script files needed to synthesise the PPC405 into the workspace from the installed coreKit
cd dc
mkdir scripts
cd scripts
ln -s $UNPACKED_COREKIT_DIR/dc/scripts/* ./
cd ..

cd ..

#Apply the unified patch to update the build scripts and constraint scripts
patch -u -p0 < `dirname $0`/fix_dc.diff

2) Edit the top variable, UNPACKED_COREKIT_DIR to have the absolute path to the unpacked coreKit directory. There should be no spaces in the paths.

3) Save this file with a suitable name (e.g. fix_dc_dir.sh) in the new workspace you have created. Make the script executable using the command chmod u+x "filename".

4) Open a new blank file and paste in the following unified diff:

diff -ru ../../IBM/DW_ppc405-FS_iv_1.0a_AV/dc/constraints/PPC405F5V1_functional.cons ./dc/constraints/PPC405F5V1_functional.cons
--- ../../IBM/DW_ppc405-FS_iv_1.0a_AV/dc/constraints/PPC405F5V1_functional.cons 2005-10-17 16:31:05.000000000 +0100
+++ ./dc/constraints/PPC405F5V1_functional.cons 2009-11-09 18:18:07.608824678 +0000
@@ -3,6 +3,9 @@
 #
 #  Copyright (c) International Business Machines Corporation, 2005.
 #
+#  Modified by Dr Matthew Swabey matthew@swabey.org to support the modern
+#  synopsys design tools and techniques.
+#
 #  This file contains trade secrets and other proprietary and confidential
 #  information of International Business Machines Corporation which are
 #  protected by copyright and other intellectual property rights and shall
@@ -49,9 +52,9 @@
 #core|cpu_topSch|exe_topSch|loadSteeringSch|ocmData_NEG[*]  I _ _ * GPRWT+R
 #core|cpu_topSch|exe_topSch|gprLpIn[*]  I DC DC GPRWT+R _
 set DCU_DATA_NEG_PINS  [remove_from_collection [get_pins core/cacheMMUSch/DCU_data_NEG*] \
-                                               [get_pins "core/cacheMMUSch/DCU_data_NEG[16] \
-                                                          core/cacheMMUSch/DCU_data_NEG[17] \
-                                                          core/cacheMMUSch/DCU_data_NEG[18]"]]
+                                               [get_pins "core/cacheMMUSch/DCU_data_NEG<16> \
+                                                          core/cacheMMUSch/DCU_data_NEG<17> \
+                                                          core/cacheMMUSch/DCU_data_NEG<18>"]]
 set OCMDATA_NEG_PINS [get_pins core/cpu_topSch/exe_topSch/loadSteeringSch/OCM_dsData*]

 set_false_path -through $DCU_DATA_NEG_PINS -through [get_pins core/cpu_topSch/exe_topSch/loadSteeringSch/gprLpIn*]
@@ -83,11 +86,11 @@
 # Input of the mux(s) are DCU_data_NEG*
 #
 for {set x 0} {$x<32} {incr x} {
-  set_false_path -through core/cpu_topSch/exe_topSch/loadSteeringSch/DCU_data_NEG[$x] \
+  set_false_path -through core/cpu_topSch/exe_topSch/loadSteeringSch/DCU_data_NEG<$x> \
                         -through [get_pins core/cpu_topSch/exe_topSch/loadSteeringSch/dRegBypass*]
 }
 for {set x 0} {$x<32} {incr x} {
-  set_false_path -through core/cpu_topSch/exe_topSch/loadSteeringSch/DCU_data_NEG[$x] \
+  set_false_path -through core/cpu_topSch/exe_topSch/loadSteeringSch/DCU_data_NEG<$x> \
                        -through [get_pins core/cpu_topSch/exe_topSch/loadSteeringSch/gprLpIn*]
 }

diff -ru ../../IBM/DW_ppc405-FS_iv_1.0a_AV/dc/constraints/PPC405F5V1_test.cons ./dc/constraints/PPC405F5V1_test.cons
--- ../../IBM/DW_ppc405-FS_iv_1.0a_AV/dc/constraints/PPC405F5V1_test.cons       2005-10-17 16:31:05.000000000 +0100
+++ ./dc/constraints/PPC405F5V1_test.cons       2009-11-10 23:17:36.560210767 +0000
@@ -3,6 +3,9 @@
 #
 #  Copyright (c) International Business Machines Corporation, 2005.
 #
+#  Modified by Dr Matthew Swabey matthew@swabey.org to support the modern
+#  synopsys design tools and techniques.
+#
 #  This file contains trade secrets and other proprietary and confidential
 #  information of International Business Machines Corporation which are
 #  protected by copyright and other intellectual property rights and shall
@@ -17,13 +20,17 @@
 #
 # This file sets up the scan constraints
 #
+
 #============================================================================
 # Create Test Clocks
 #============================================================================
-create_test_clock CPMC405CLOCK        -waveform {60.0 70.0} -period 100.0
-create_test_clock CPMC405PLBSYNCCLOCK -waveform {60.0 70.0} -period 100.0
-create_test_clock JTGC405TCK          -waveform {60.0 70.0} -period 100.0
-create_test_clock TESTC405BISTCE0STCLK -waveform {60.0 70.0} -period 100.0
+
+set_dft_signal -view existing_dft -type ScanClock -port CPMC405CLOCK -timing [list 60.0 70.0]
+set_dft_signal -view existing_dft -type ScanClock -port CPMC405PLBSYNCCLOCK -timing [list 60.0 70.0]
+set_dft_signal -view existing_dft -type ScanClock -port JTGC405TCK -timing [list 60.0 70.0]
+set_dft_signal -view existing_dft -type ScanClock -port TESTC405BISTCE0STCLK -timing [list 60.0 70.0]
+
+#set_dft_equivalent_signals [list CPMC405CLOCK CPMC405PLBSYNCCLOCK JTGC405TCK TESTC405BISTCE0STCLK]

 #============================================================================
 #    Define scan ports
@@ -33,20 +40,14 @@
 set test_scan_in_port_naming_style "scan_in%s"
 set test_scan_out_port_naming_style "scan_out%s"

-## DFT DRC currently (2003.12) has a problem infering resets, so do it explicitly
-#if {[get_ports -quiet test_async_rst_n] == ""} {create_port test_async_rst_n}
-#set_signal_type test_async_inverted test_async_rst_n
+set_dft_signal -view existing_dft -type TestMode -port [get_port TESTC405TESTMODE] -active_state 1
+set_dft_signal -view existing_dft -type ScanEnable -port [get_port TESTC405SCANENABLE] -active_state 1
+set_dft_signal -view spec -type ScanEnable -port [get_port TESTC405SCANENABLE] -active_state 1

-set_dft_signal  test_mode        -port TESTC405TESTMODE
-set_scan_signal test_scan_enable -port TESTC405SCANENABLE
 for {set x 0} {$x<8} {incr x} {
-       set_scan_signal test_scan_in     -port TESTC405SCANIN${x}
-       set_scan_signal test_scan_out    -port C405TESTSCANOUT${x}
+       set_dft_signal -view spec -type ScanDataIn -port [get_port TESTC405SCANIN${x}]
+       set_dft_signal -view spec -type ScanDataOut -port [get_port C405TESTSCANOUT${x}]
 }
-#for {set x 0} {$x<32} {incr x} {
-#      set_scan_signal test_scan_in     -port TESTC405SCANIN${x} -chain scan${x}
-#      set_scan_signal test_scan_out    -port C405SCANOUT${x} -chain scan${x}
-#}

 #============================================================================
 #    Set up test protocol
@@ -68,16 +69,16 @@
 set_scan_element false [get_cell core/cacheMMUSch/icu_topSch/icuRamDataFlow/tagArray_Sch/icu_tag/tagSram]

 # Assert these input ports to specified values during test mode
-set_signal_type "test_asynch" RSTC405RESETCORE
-set_test_hold 1 TESTC405BISTCE1MODE
-set_test_hold 1 TESTC405BISTCE1ENABLE
-set_test_hold 1 TIEC405CLOCKENABLE
-set_test_hold 1 TIEC405DUTYENABLE
-set_test_hold 1 TESTC405TESTMODE
-set_test_hold 1 JTGC405TRSTNEG
-set_test_hold 1 CPMC405CPUCLKENCCLK
-set_test_hold 1 CPMC405TIMERCLKENCCLK
-set_test_hold 1 CPMC405JTAGCLKENCCLK
+set_dft_signal -view existing_dft -type Reset -active_state 1 -port [get_port RSTC405RESETCORE]
+set_dft_signal -view existing_dft -type Constant -active_state 1 -port [get_port TESTC405BISTCE1MODE]
+set_dft_signal -view existing_dft -type Constant -active_state 1 -port [get_port TESTC405BISTCE1ENABLE]
+set_dft_signal -view existing_dft -type Constant -active_state 1 -port [get_port TIEC405CLOCKENABLE]
+set_dft_signal -view existing_dft -type Constant -active_state 1 -port [get_port TIEC405DUTYENABLE]
+set_dft_signal -view existing_dft -type Constant -active_state 1 -port [get_port TESTC405TESTMODE]
+set_dft_signal -view existing_dft -type Constant -active_state 1 -port [get_port JTGC405TRSTNEG]
+set_dft_signal -view existing_dft -type Constant -active_state 1 -port [get_port CPMC405CPUCLKENCCLK]
+set_dft_signal -view existing_dft -type Constant -active_state 1 -port [get_port CPMC405TIMERCLKENCCLK]
+set_dft_signal -view existing_dft -type Constant -active_state 1 -port [get_port CPMC405JTAGCLKENCCLK]

 ## Set the default test parameters
 set test_default_period        100
@@ -88,19 +89,21 @@

 set test_stil_netlist_format verilog

-set_dft_optimization_configuration -preserve_design_name true
+set_dft_insertion_configuration -preserve_design_name true
 set_scan_configuration -style multiplexed_flip_flop
-set_scan_configuration -methodology full_scan
-set_scan_configuration -dedicated_scan_ports true
+set_scan_configuration -create_dedicated_scan_out_ports true
 set_scan_configuration -add_lockup true
 set_scan_configuration -clock_mixing mix_clocks
-set_scan_configuration -rebalance false
-set_scan_configuration -internal_clocks false
+set_scan_configuration -internal_clocks none
 set_scan_configuration -hierarchical_isolation false
 set_scan_configuration -create_test_clocks_by_system_clock_domain false
-set_scan_configuration -disable true
-set_scan_configuration -bidi input
-set_scan_configuration -insert_end_of_chain_lockup_latch true
+#set_scan_configuration -disable true
+set_dft_configuration -fix_bidirectional enable
+set_dft_configuration -fix_clock enable
+set_dft_configuration -fix_reset enable
+set_dft_configuration -fix_bus enable
+#set_scan_configuration -insert_end_of_chain_lockup_latch true
+set_scan_configuration -insert_terminal_lockup true

 ## Need to specify the number of chains.
 set_scan_configuration -chain_count 8
@@ -108,5 +111,5 @@

 # set the latches in the design to transperent mode
 # Make sure that your latch variable is set in the setup.tcl file
-set_scan_transparent true ${LIBRARY_NAME}/$LATCH_NAMES -existing
+#set_scan_transparent true ${LIBRARY_NAME}/$LATCH_NAMES -existing

diff -ru ../../IBM/DW_ppc405-FS_iv_1.0a_AV/dc/run_compile.csh ./dc/run_compile.csh
--- ../../IBM/DW_ppc405-FS_iv_1.0a_AV/dc/run_compile.csh        2005-10-17 16:31:05.000000000 +0100
+++ ./dc/run_compile.csh        2009-11-09 18:08:54.904611366 +0000
@@ -3,6 +3,9 @@
 # **************************************************************************
 #
 #  Copyright (c) International Business Machines Corporation, 2004.
+#
+#  Modified by Dr Matthew Swabey matthew@swabey.org to support the modern
+#  synopsys design tools and techniques.
 #
 #  This file contains trade secrets and other proprietary and confidential
 #  information of International Business Machines Corporation which are
@@ -43,6 +46,7 @@
 setenv scan_switch true
 setenv bist_switch false
 setenv ultra_switch false
+setenv topo_switch false
 setenv high_switch false
 setenv Test_lic true
 setenv Power_lic true
@@ -55,6 +59,7 @@
    echo "  -synth       : synthesize; will elaborate if elab switch is on or no elab db exists"
    echo "  -dft         : Stitches scan chain; valid only if a synthesis is completed first"
    echo "  -ultra       : use Ultra options that require an Ultra-Opt license"
+   echo "  -topo        : use Topographical synthesis"
    echo "  -high        : high effort compile"
    echo "  -bist        : Add Memory BIST to the design"
    echo "  -no_scan     : Do not use -scan with the compile options that require a Test-Compiler license"
@@ -100,6 +105,9 @@
        case x-ultra:
            setenv ultra_switch true
            breaksw
+       case x-topo:
+           setenv topo_switch true
+           breaksw
        case x-high:
            setenv high_switch true
            breaksw
@@ -115,6 +123,7 @@
            echo "  -bist               : Add Memory BIST to the design"
            echo "  -no_scan            : Do not use -scan with the compile options that require a Test-Compiler license"
            echo "  -ultra              : use Ultra options that require an Ultra-Opt license"
+           echo "  -topo               : use Topographical synthesis"
            echo "  -no_scan            : Do not use -scan with the compile options that require an Test-Compiler license"
            echo "  -no_power           : Do not use Power Compiler to add clock gating to the design"
            echo "  -clean_all          : Deletes all reports and dbs in the pass directories"
@@ -125,6 +134,7 @@
            echo "  -synth              : synthesize; will elaborate if elab switch is on or no elab db exists"
            echo "  -dft                : Stitches scan chain; valid only if a synthesis completed first"
            echo "  -ultra              : use Ultra options that require an Ultra-Opt license"
+           echo "  -topo               : use Topographical synthesis"
            echo "  -bist               : Add Memory BIST to the design"
            echo "  -no_scan            : Do not use -scan with the compile options that require an Test-Compiler license"
            echo "  -no_power           : Do not use Power Compiler to add clock gating to the design"
@@ -167,7 +177,11 @@
      echo "Elaborating Design"
      echo "##########################################################"
      echo ""
+  if ($topo_switch == true) then
+     dc_shell-t -topo -f ./scripts/p405s_elab.tcl | tee logs/p405s_elab.log
+  else
      dc_shell-t -f ./scripts/p405s_elab.tcl | tee logs/p405s_elab.log
+  endif
      echo ""
      echo "##########################################################"
      echo "Finished Elaboration"
@@ -182,7 +196,11 @@
      echo "Synthesizing Design"
      echo "##########################################################"
      echo ""
+  if ($topo_switch == true) then
+     dc_shell-t -topo -f ./scripts/p405s_compile.tcl | tee logs/p405s_compile.log
+  else
      dc_shell-t -f ./scripts/p405s_compile.tcl | tee logs/p405s_compile.log
+  endif
      echo ""
      echo "##########################################################"
      echo "Finished Synthesis"
@@ -197,7 +215,11 @@
      echo "Stitching scan chains"
      echo "##########################################################"
      echo ""
+  if ($topo_switch == true) then
+     dc_shell-t -topo -f ./scripts/p405s_dft.tcl | tee logs/p405s_dft.log
+  else
      dc_shell-t -f ./scripts/p405s_dft.tcl | tee logs/p405s_dft.log
+  endif
      echo ""
      echo "##########################################################"
      echo "Finished Stitching scan chains"
@@ -214,6 +236,7 @@
 unsetenv scan_switch
 unsetenv bist_switch
 unsetenv ultra_switch
+unsetenv topo_switch
 unsetenv high_switch
 unsetenv Test_lic
 unsetenv Power_lic
diff -ru ../../IBM/DW_ppc405-FS_iv_1.0a_AV/dc/scripts/p405s_compile.tcl ./dc/scripts/p405s_compile.tcl
--- ../../IBM/DW_ppc405-FS_iv_1.0a_AV/dc/scripts/p405s_compile.tcl      2005-10-17 16:31:05.000000000 +0100
+++ ./dc/scripts/p405s_compile.tcl      2009-11-20 20:04:16.248225918 +0000
@@ -3,6 +3,9 @@
 #
 #  Copyright (c) International Business Machines Corporation, 2005.
 #
+#  Modified by Dr Matthew Swabey matthew@swabey.org to support the modern
+#  synopsys design tools and techniques.
+#
 #  This file contains trade secrets and other proprietary and confidential
 #  information of International Business Machines Corporation which are
 #  protected by copyright and other intellectual property rights and shall
@@ -22,6 +25,7 @@
 set scan_switch  [getenv scan_switch]
 set high_switch  [getenv high_switch]
 set ultra_switch [getenv ultra_switch]
+set topo_switch  [getenv topo_switch]

 set SCRIPT compile

@@ -49,9 +53,9 @@
 # Read in elaborated design
 #
 #=================================================================================
-       set INPUT_DB "./db/${TOPDESIGN}_elab.db"
+       set INPUT_DB "./db/${TOPDESIGN}_elab.ddc"
         if { [file readable $INPUT_DB ] == 1} {
-          read_db $INPUT_DB
+          read_ddc $INPUT_DB
         } else {
           puts "FAILURE: Failed to find $INPUT_DB"
           exit 0
@@ -94,6 +98,7 @@
 # At this point we no longer need to suppress the non-unate clock warning
 # and we want to know if it happens after the compile.
 set suppress_errors {CMD-041 TIM-134}
+
 #=================================================================================
 #
 # Uniquify
@@ -101,7 +106,7 @@
 #=================================================================================
         uniquify
        link
-       write_file -format db -hier -o ./db/${FILENAME_PREFIX}_pre_compile.db
+       write_file -format ddc -hier -o ./db/${FILENAME_PREFIX}_pre_compile.ddc

 #=================================================================================
 #
@@ -129,28 +134,29 @@
 # Compile
 #
 #=================================================================================
-       if {$ultra_switch == "true"} {
-               set_ultra_optimization true -force
+       set COMPILE_STRING ""
+        if {($Power_lic == "true")} {
+          append COMPILE_STRING " -gate_clock"
+       }
+       if {$scan_switch == "true"} {
+          append COMPILE_STRING " -scan"
        }

-       if {$high_switch == "true"} {
-               if {$scan_switch == "true"} {
-                       echo "** compile -scan -map_effort high **"
-                       compile -scan -map_effort high -boundary_optimization
-               } else {
-                       echo "** compile -map_effort high **"
-                       compile -map_effort high -boundary_optimization
-               }
+       if {$ultra_switch == "true" | $topo_switch == "true"} {
+          set COMPILE_STRING "compile_ultra ${COMPILE_STRING} "
+          echo "** " ${COMPILE_STRING} " **"
+          eval ${COMPILE_STRING}
        } else {
-               if {$scan_switch == "true"} {
-                       echo "** compile -scan **"
-                       compile -scan -effort medium -boundary_optimization
-               } else {
-                       echo "** compile  **"
-                       compile -effort medium -boundary_optimization
-               }
+          if {$high_switch == "true"} {
+            set COMPILE_STRING "compile ${COMPILE_STRING} -map_effort high -boundary_optimization"
+            echo "** " ${COMPILE_STRING} " **"
+            eval ${COMPILE_STRING}
+          } else {
+            set COMPILE_STRING "compile ${COMPILE_STRING} -map_effort medium -boundary_optimization"
+            echo "** " ${COMPILE_STRING} " **"
+            eval ${COMPILE_STRING}
+          }
        }
-
         current_design ${TOPDESIGN}
         link

@@ -163,9 +169,8 @@
                source ./scripts/verilog_name_rules.tcl
                verilog_name_rules
                change_names -rule verilog_name_rule -hierarchy -verbose
-               change_names -rule verilog_name_rule -hierarchy -verbose
        }
-       write_file -format db -hier -o ./db/${FILENAME_PREFIX}.db
+       write_file -format ddc -hier -o ./db/${FILENAME_PREFIX}.ddc
        write -f verilog -hier -o ./netlist/${FILENAME_PREFIX}.v

 #=================================================================================
diff -ru ../../IBM/DW_ppc405-FS_iv_1.0a_AV/dc/scripts/p405s_dft.tcl ./dc/scripts/p405s_dft.tcl
--- ../../IBM/DW_ppc405-FS_iv_1.0a_AV/dc/scripts/p405s_dft.tcl  2005-10-17 16:31:05.000000000 +0100
+++ ./dc/scripts/p405s_dft.tcl  2009-11-20 20:58:02.696428891 +0000
@@ -3,6 +3,9 @@
 #
 #  Copyright (c) International Business Machines Corporation, 2005.
 #
+#  Modified by Dr Matthew Swabey matthew@swabey.org to support the modern
+#  synopsys design tools and techniques.
+#
 #  This file contains trade secrets and other proprietary and confidential
 #  information of International Business Machines Corporation which are
 #  protected by copyright and other intellectual property rights and shall
@@ -17,6 +20,7 @@
 #=================================================================================
 # This script inserts scan
 #=================================================================================
+set high_switch  [getenv high_switch]

 #=================================================================================
 #
@@ -43,9 +47,9 @@
 # Read in elaborated design
 #
 #=================================================================================
-       set INPUT_DB ./db/${TOPDESIGN}_compile.db
+       set INPUT_DB ./db/${TOPDESIGN}_compile.ddc
         if { [file readable $INPUT_DB ] == 1} {
-          read_db $INPUT_DB
+          read_ddc $INPUT_DB
         } else {
           puts "Error: Failed to find $INPUT_DB"
           exit 0
@@ -53,7 +57,7 @@

         current_design ${TOPDESIGN}
         link
-        echo "INFO: Elaborated DB file for ${TOPDESIGN} linked at [exec date]"
+        echo "INFO: Elaborated DDC file for ${TOPDESIGN} linked at [exec date]"

         # Check the design
         echo "INFO: link and check_design output for module ${TOPDESIGN}\n" > \
@@ -82,6 +86,10 @@
 # Insert DFT
 #=================================================================================

+if {$high_switch == "true"} {
+    set_dft_insertion_configuration -map_effort high
+}
+
 insert_dft
 # check_dft  -verbose  > ./reports/${FILENAME_PREFIX}.check_dft.post.rpt
 dft_drc -verbose > ./reports/${FILENAME_PREFIX}.dft_drc.post.rpt
@@ -97,7 +105,7 @@
                change_names -rule verilog_name_rule -hierarchy -verbose
                change_names -rule verilog_name_rule -hierarchy -verbose
        }
-       write_file -format db -hier -o ./db/${FILENAME_PREFIX}.db
+       write_file -format ddc -hier -o ./db/${FILENAME_PREFIX}.ddc
        write -f verilog -hier -o ./netlist/${FILENAME_PREFIX}.v

 #=================================================================================
@@ -126,5 +134,6 @@

 date

+
 exit

diff -ru ../../IBM/DW_ppc405-FS_iv_1.0a_AV/dc/scripts/p405s_elab.tcl ./dc/scripts/p405s_elab.tcl
--- ../../IBM/DW_ppc405-FS_iv_1.0a_AV/dc/scripts/p405s_elab.tcl 2005-10-17 16:31:05.000000000 +0100
+++ ./dc/scripts/p405s_elab.tcl 2009-11-09 18:17:58.598880874 +0000
@@ -3,6 +3,9 @@
 #
 #  Copyright (c) International Business Machines Corporation, 2005.
 #
+#  Modified by Dr Matthew Swabey matthew@swabey.org to support the modern
+#  synopsys design tools and techniques.
+#
 #  This file contains trade secrets and other proprietary and confidential
 #  information of International Business Machines Corporation which are
 #  protected by copyright and other intellectual property rights and shall
@@ -21,9 +24,7 @@
 set bist_switch  [getenv bist_switch]

 set template_naming_style "%s_%p"
-set hdlin_enable_vpp TRUE
-set hdlin_preserve_vpp_files true
-set compile_delete_unloaded_sequential_cells false
+#set compile_delete_unloaded_sequential_cells false

 sh rm -rf ./ppc405
 define_design_lib ppc405 -path ./ppc405
@@ -48,16 +49,26 @@
 set TOPDESIGN $design_name
 set FILENAME_PREFIX ${TOPDESIGN}_${SCRIPT}

+
 #=================================================================================
 #
 # Read in design files
 #
 #=================================================================================
 # read all files in ppc_405s_cpu directory except the following behavioral models
-set acs_exclude_list [concat $acs_exclude_list "PPC405F5V1_soft.v SRAM_256wordsX44bits.v"]
+set RTL_SRC_PATH ../src/rtl/
+set RTL_SRC_LIST [list]
+foreach FILE [glob -nocomplain -type f -directory $RTL_SRC_PATH -tails *.v] {
+  lappend RTL_SRC_LIST $FILE
+}
+
+set RTL_SRC_LIST [ lsearch -all -inline -not -exact $RTL_SRC_LIST PPC405F5V1_soft.v ]
+set RTL_SRC_LIST [ lsearch -all -inline -not -exact $RTL_SRC_LIST SRAM_256wordsX44bits.v ]

 # if bist enabled
 if {($bist_switch == "true")} {
+       echo "NOT UPDATED YET"
+       exit
        echo "** BIST ENABLED **"
        set acs_exclude_list [concat $acs_exclude_list "p405s_DCU_parityRAM.v"]
        set acs_exclude_list [concat $acs_exclude_list "p405s_DCU_ram16K.v"]
@@ -79,9 +90,10 @@
 # if the User Specific RAMs option is set in the Configure Activity
 set memory_wrappers " "
 set memory_wrappers [concat ../src/mem_models/p405s_dataram_64X34_wrapper.v  ../src/mem_models/p405s_sram256x46_wrapper.v ../src/mem_models/p405s_sram256x48_wrapper.v ../src/mem_models/p405s_sram512x128_wrapper.v ../src/mem_models/p405s_sram512x32_wrapper.v ../src/mem_models/p405s_sram512x8_wrapper.v ../src/mem_models/p405s_sramBytWr512x128_wrapper.v]
-set acs_hdl_source [concat $memory_wrappers ../src/rtl/p405s_params.v $acs_hdl_source "../src/rtl"]
+set RTL_SRC_LIST [concat $memory_wrappers ../src/rtl/p405s_params.v $RTL_SRC_LIST]

-acs_read_hdl -recurse -format verilog -library ppc405 -verbose $TOPDESIGN -no_elaborate
+lappend search_path ../src/rtl
+analyze -library ppc405 -format verilog $RTL_SRC_LIST

 #=================================================================================
 #
@@ -133,28 +145,14 @@
 # Elaborate the design
 #
 #=================================================================================
-if {($Power_lic == "true")} {
-  elaborate -library ppc405 -gate_clock $TOPDESIGN
-} else {
-  elaborate -library ppc405 $TOPDESIGN
-}

-#=================================================================================
-#
-# Connect the scan enable port to control the gated clock cells
-#
-#=================================================================================
-if {($Power_lic == "true")} {
-   set_scan_signal test_scan_enable -port [get_port TESTC405SCANENABLE]
-   set_dft_signal test_mode         -port [get_port TESTC405TESTMODE]
-   hookup_testports -verbose
-}
+elaborate -library ppc405 $TOPDESIGN

 #=================================================================================
 #
 # Write out elaborated design
 #
 #=================================================================================
-write -f db -hier -o $acs_work_dir/db/${FILENAME_PREFIX}.db
+write -f ddc -hier -o ./db/${FILENAME_PREFIX}.ddc

 exit
diff -ru ../../IBM/DW_ppc405-FS_iv_1.0a_AV/dc/scripts/p405s_zerowlm.tcl ./dc/scripts/p405s_zerowlm.tcl
--- ../../IBM/DW_ppc405-FS_iv_1.0a_AV/dc/scripts/p405s_zerowlm.tcl      2005-10-17 16:31:05.000000000 +0100
+++ ./dc/scripts/p405s_zerowlm.tcl      2009-11-20 20:04:35.505970139 +0000
@@ -45,9 +45,9 @@
 # Read in elaborated design
 #
 #=================================================================================
-       set INPUT_DB "./db/${TOPDESIGN}_elab.db"
+       set INPUT_DB "./db/${TOPDESIGN}_elab.ddc"
         if { [file readable $INPUT_DB ] == 1} {
-          read_db $INPUT_DB
+          read_ddc $INPUT_DB
         } else {
           puts "FAILURE: Failed to find $INPUT_DB"
           exit 0
@@ -55,7 +55,7 @@

         current_design ${TOPDESIGN}
         link
-        echo "INFO: Elaborated DB file for ${TOPDESIGN} linked at [exec date]"
+        echo "INFO: Elaborated DDC file for ${TOPDESIGN} linked at [exec date]"

         # Check the design
         echo "INFO: link and check_design output for module ${TOPDESIGN}\n" > \
@@ -104,7 +104,7 @@
 #=================================================================================
         uniquify
        link
-       write_file -format db -hier -o ./db/${FILENAME_PREFIX}_pre_compile.db
+       write_file -format ddc -hier -o ./db/${FILENAME_PREFIX}_pre_compile.ddc

 #=================================================================================
 #
@@ -159,7 +159,7 @@
                change_names -rule verilog_name_rule -hierarchy -verbose
        }
        write -f verilog -hier -o ./netlist/${FILENAME_PREFIX}.v
-       write_file -format db -hier -o ./db/${FILENAME_PREFIX}.db
+       write_file -format ddc -hier -o ./db/${FILENAME_PREFIX}.ddc

 #=================================================================================
 #
diff -ru ../../IBM/DW_ppc405-FS_iv_1.0a_AV/dc/scripts/setup.tcl ./dc/scripts/setup.tcl
--- ../../IBM/DW_ppc405-FS_iv_1.0a_AV/dc/scripts/setup.tcl      2005-10-17 16:31:05.000000000 +0100
+++ ./dc/scripts/setup.tcl      2009-11-09 18:17:58.609879585 +0000
@@ -3,6 +3,9 @@
 #
 #  Copyright (c) International Business Machines Corporation, 2005.
 #
+#  Modified by Dr Matthew Swabey matthew@swabey.org to support the modern
+#  synopsys design tools and techniques.
+#
 #  This file contains trade secrets and other proprietary and confidential
 #  information of International Business Machines Corporation which are
 #  protected by copyright and other intellectual property rights and shall
@@ -10,7 +13,7 @@
 #  or used for any purpose except as specifically authorized in writing by
 #  International Business Machines Corporation. This notice must be
 #  contained as part of this text at all times.
-#
+#
 # **************************************************************************
 #
 #===========================================================================
@@ -128,30 +131,46 @@
 #  create the MilkyWay database.
 #----------------------------------------------------------------------------
 set lib_path ../tech_lib
-#set MW_REF_LIB [list ${lib_path}/artisan_13lvfsg/astro/artisan_13lvfsg_mdb \
-#                     ${lib_path}/artisan_13lvfsg/syn/dataram_64X34 \
-#                     ${lib_path}/memories/artisan_13lvfsg/mdb/sram256x46 \
-#                     ${lib_path}/memories/artisan_13lvfsg/mdb/sram256x48 \
-#                     ${lib_path}/memories/artisan_13lvfsg/mdb/sram512x128 \
-#                     ${lib_path}/memories/artisan_13lvfsg/mdb/sram512x32 \
-#                     ${lib_path}/memories/artisan_13lvfsg/mdb/sram512x8 \
-#                     ${lib_path}/memories/artisan_13lvfsg/mdb/sramBytWr512x128 ]
+set MW_REF_LIB [list ${lib_path}/artisan_13lvfsg/astro/artisan_13lvfsg_mdb \
+                     ${lib_path}/artisan_13lvfsg/syn/dataram_64X34 \
+                     ${lib_path}/memories/artisan_13lvfsg/mdb/sram256x46 \
+                     ${lib_path}/memories/artisan_13lvfsg/mdb/sram256x48 \
+                     ${lib_path}/memories/artisan_13lvfsg/mdb/sram512x128 \
+                     ${lib_path}/memories/artisan_13lvfsg/mdb/sram512x32 \
+                     ${lib_path}/memories/artisan_13lvfsg/mdb/sram512x8 \
+                     ${lib_path}/memories/artisan_13lvfsg/mdb/sramBytWr512x128 ]
 set MW_TECH_FILE ${lib_path}/artisan_13lvfsg/apollo/tf/tsmc13fsg_hs_6lm.tf
-set MAX_TLUPLUS ${lib_path}/star-rc/t013s6ml_lowklv.tluplus
-set NOM_TLUPLUS ${lib_path}/star-rc/t013s6ml_lowklv.tluplus
-set MIN_TLUPLUS ${lib_path}/star-rc/t013s6ml_lowklv.tluplus
-set MAP_FILE    ${lib_path}/star-rc/plib2itf.map
-#
+
+set MW_DESIGN_LIBRARY ./mw.db
+
+echo $MW_REF_LIB
+
+if {$synopsys_program_name == "dc_shell"}  {
+  if {[shell_is_in_topographical_mode]} {
+    if {[expr {![file isdirectory $MW_DESIGN_LIBRARY ]}]} {
+      echo "INFO: Milkyway library not found. Creating a new one."
+      create_mw_lib -technology $MW_TECH_FILE -mw_reference_library $MW_REF_LIB $MW_DESIGN_LIBRARY
+    }
+    open_mw_lib $MW_DESIGN_LIBRARY
+    check_library
+  }
+}
+
 #----------------------------------------------------------------------------
 # Use TLUPlus for RC Computation
-# If you do not have TLUPlus, Comment out the following lines
+# If you do not have TLUPlus, Comment out the following
 #----------------------------------------------------------------------------
-# -tech2itf_map  $MAP_FILE
+set MAX_TLUPLUS ${lib_path}/star-rc/t013s6ml_lowklv.tluplus
+set NOM_TLUPLUS ${lib_path}/star-rc/t013s6ml_lowklv.tluplus
+set MIN_TLUPLUS ${lib_path}/star-rc/t013s6ml_lowklv.tluplus
+set MAP_FILE    ${lib_path}/star-rc/tf2itf.map
+
 if { [info exists synopsys_program_name] } {
-   if { [regexp {^psyn} $synopsys_program_name] } {
+   if { [regexp {^psyn} $synopsys_program_name] | [shell_is_in_topographical_mode] } {
        set_tlu_plus_files -max_tluplus $MAX_TLUPLUS \
                         -min_tluplus $MIN_TLUPLUS \
-                       -plib2itf_map $MAP_FILE
+                       -tech2itf_map $MAP_FILE
+       check_tlu_plus_files
    }
 }

5) Save the diff above as fix_dc.diff in the same directory as the script. The two must sit side by side for this to work.

6) Open a terminal and cd into the top of the workspace you have created. Execute the script and you should see the following output:

patching file ./dc/constraints/PPC405F5V1_functional.cons
patching file ./dc/constraints/PPC405F5V1_test.cons
patching file ./dc/run_compile.csh
patching file ./dc/scripts/p405s_compile.tcl
patching file ./dc/scripts/p405s_dft.tcl
patching file ./dc/scripts/p405s_elab.tcl
patching file ./dc/scripts/p405s_zerowlm.tcl
patching file ./dc/scripts/setup.tcl
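
If the output looks different, it is worth checking that the diff was copied completely before blaming the tools. The following is only a rough sketch and not part of the original script: list_patched_files is a hypothetical helper that prints the target named on each "+++" header line of a unified diff. It assumes none of the paths contain spaces and that no added line inside a hunk happens to start with "++ ".

```shell
# Hypothetical helper (not part of the original script): print the target
# path from every "+++ <path> <timestamp>" header line of a unified diff.
# Assumes paths contain no spaces, so the path is always the second field.
list_patched_files() {
    awk '/^\+\+\+ / { print $2 }' "$1"
}
```

Running list_patched_files fix_dc.diff on a complete copy should print the same eight paths as the "patching file" lines above.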

7) cd into the dc directory of your workspace and run the following command: ./run_compile.csh -elab -synth -dft

8) Eventually you should have a lovely PowerPC™ 405 gate-level netlist in ./dc/netlist ...

9) ...

10) Profit!
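
A quick way to confirm that step 8 really did leave something behind is to look for non-empty Verilog files in the netlist directory. This is a minimal sketch: find_netlists is a hypothetical helper, and it only checks that the files exist and are not empty, not that the synthesis results are any good.

```shell
# Hypothetical helper: print every non-empty .v file under the given
# directory, and return non-zero if the directory is missing or holds none.
find_netlists() {
    found=$(find "$1" -type f -name '*.v' -size +0 2>/dev/null) || return 1
    [ -n "$found" ] && printf '%s\n' "$found"
}
```

From the top of the workspace, something like find_netlists ./dc/netlist || echo "no netlists yet" will either list the netlists or tell you the run has not produced any.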

Published with permission from IBM and Synopsys
