Very Large Scale Integration (VLSI)

Monday, 24 June 2019

Low Power Design Techniques

In today's IoT (Internet of Things) world, a variety of wearable and portable smart devices are coming to market, and most of them are battery operated. These devices need to be power efficient so that they can run on a battery for a long time. This is where the concept of low power design comes in.

Different strategies are used to reduce power consumption. Some of them are listed below.

1. Clock Gating
The clock is the highest-frequency toggling signal in the SoC, so it is usually the largest contributor to dynamic power consumption, and it keeps consuming power even when the flops it feeds are not changing state. It is therefore practical to gate the clock so that it does not reach a set of registers, or a whole block of the design, while they are idle. This removes the switching activity caused by the clock and hence reduces dynamic power consumption.
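
As a rough illustration of the idea, here is a minimal behavioural sketch of a latch-based clock gate. The module and signal names are hypothetical, and in a real flow you would normally instantiate the integrated clock gating cell from your standard cell library (or let the synthesis tool insert it) rather than write this by hand.

```systemverilog
// Minimal latch-based clock gate (behavioural sketch, hypothetical names).
module clock_gate (
  input  logic clk,        // free-running clock
  input  logic enable,     // high when the downstream registers must update
  output logic gated_clk   // clock delivered to the gated registers
);
  logic enable_latched;

  // The latch is transparent only while clk is low, so enable_latched
  // can change only when the AND output is already low. This avoids
  // glitches and truncated pulses on gated_clk.
  always_latch begin
    if (!clk)
      enable_latched <= enable;
  end

  // While enable_latched is low, gated_clk stays low and the
  // registers behind it see no clock edges.
  assign gated_clk = clk & enable_latched;
endmodule
```

The latch is what makes this safe: a plain AND of clk and enable would pass any glitch on enable straight onto the clock net.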

2. Power gating
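Power gating is a technique that shuts off the power supply to a block when the block is not required to be on. For example, in a mobile phone the voice processing block can be shut down when the user has no incoming or outgoing call. Because a powered-down block consumes no dynamic power and very little leakage power, this is one of the most effective methods of reducing power consumption.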

3. Multiple Vt Library cells
Nowadays libraries provide the same cells with two different threshold voltages, so that synthesis tools can choose cells depending on the requirement. With a low Vt, sub-threshold leakage increases but the cell is also faster. So the synthesis tool will insert low Vt cells on timing-critical paths and high Vt cells on the other paths.

4. Dynamic voltage and frequency scaling
Dynamic Voltage and Frequency Scaling (DVFS) combines two power saving techniques: dynamic frequency scaling and dynamic voltage scaling. With this technique the same block can work at different voltages and frequencies at different times. When it has to perform a compute-intensive task (for example, a complex equation solver) it needs more speed, so it operates at a higher voltage. When only light computation is required, it can operate at a lower voltage.

5. Supply voltage reduction
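Power is the product of current and voltage (P = IV), so reducing the supply voltage reduces power consumption. For dynamic switching power in CMOS the saving is even larger, because it scales with the square of the supply voltage. However, reducing the voltage also reduces switching speed.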
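
To make the voltage dependence concrete, here is the standard first-order expression for CMOS dynamic power. The symbols are not from the post itself: \alpha is the switching activity factor, C_L the switched capacitance, V_{DD} the supply voltage and f the clock frequency.

```latex
P_{\mathrm{dyn}} = \alpha \, C_L \, V_{DD}^{2} \, f

% Worked example: lowering V_{DD} from 1.0 V to 0.8 V at the same \alpha, C_L and f
\frac{P_{\mathrm{dyn}}(0.8\,\mathrm{V})}{P_{\mathrm{dyn}}(1.0\,\mathrm{V})}
  = \left(\frac{0.8}{1.0}\right)^{2} = 0.64
```

So a 20% reduction in supply voltage gives roughly a 36% reduction in dynamic power. This is also why DVFS lowers voltage and frequency together, since f enters the same product.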

6. Multi-voltage design
In an SoC some blocks (for example, RAM) require higher speed, so they can be powered with a higher voltage, while other blocks (for example, peripheral devices) do not need high speed and can be powered with a lower voltage, which in turn reduces leakage power. In earlier days the whole design shared the same supply, which forced every block to operate at the high voltage. With the multi-voltage technique we can achieve leakage reduction.

In upcoming posts, we will discuss more on UPF (Unified Power Format) and low power verification.

Sunday, 26 March 2017

Understanding real, realtime and shortreal variables of SystemVerilog

This post will help you understand the difference between the real, realtime and shortreal data types of SystemVerilog and their usage.

We faced an issue with real and realtime variables while writing a timing check.

Below is a simplified example of that check.

`timescale 1ns/1fs
module test;
  real a, b;
  realtime t1, t2;

  initial begin
    #1ns;
    t1 = $realtime;
    #1.8ns;
    t2 = $realtime;
    b = 1.8;
    a = t2 - t1;   // measured delay, computed as a 64-bit real

    if (a == b)
      $display("PASS a = %f b = %f", a, b);
    else
      $display("FAIL a = %f b = %f", a, b);
  end
endmodule

and here is the output we got:

FAIL a = 1.800000 b = 1.800000

How did that happen? Is 1.800000 really != 1.800000?

Now let's try something else: instead of real, we use shortreal.

`timescale 1ns/1fs
module test;
  shortreal a, b;
  realtime t1, t2;

  initial begin
    #1ns;
    t1 = $realtime;
    #1.8ns;
    t2 = $realtime;
    b = 1.8;
    a = t2 - t1;   // measured delay, rounded to a 32-bit shortreal

    if (a == b)
      $display("PASS a = %f b = %f", a, b);
    else
      $display("FAIL a = %f b = %f", a, b);
  end
endmodule

Now the result was as expected !!

PASS a = 1.800000 b = 1.800000

To understand this, let's go to the SystemVerilog LRM. As per the LRM:

The real data type is from Verilog-2001, and is the same as a C double.

The shortreal data type is a SystemVerilog data type, and is the same as a C float.

So float is a 32-bit data type and double is a 64-bit data type. That sounds fine, but how is 1.800000 != 1.800000?

It is not only about being a 32-bit or 64-bit data type; it is more about precision. "Precision is the main difference where float is a single precision (32 bit) floating point data type, double is a double precision (64 bit) floating point data type".

To understand this difference, let's go further and print the values with more digits after the decimal point.

In the example below we have used both real and shortreal to see the difference.

`timescale 1ns/1fs
module test;
  real a, b;
  shortreal c, d;
  realtime t1, t2, t3, t4;

  // Case 1: comparison done with 64-bit real
  initial begin
    #1ns;
    t1 = $realtime;
    #1.8ns;
    t2 = $realtime;
    b = 1.8;
    a = t2 - t1;

    if (a == b)
      $display("Case1: PASS \na = %1.100f \nb = %1.100f", a, b);
    else
      $display("Case1: FAIL \na = %1.100f \nb = %1.100f", a, b);
  end

  // Case 2: comparison done with 32-bit shortreal
  initial begin
    #1ns;
    t3 = $realtime;
    #1.8ns;
    t4 = $realtime;
    d = 1.8;
    c = t4 - t3;   // use this block's own timestamps, not t1/t2

    if (c == d)
      $display("Case2: PASS \nc = %1.100f \nd = %1.100f", c, d);
    else
      $display("Case2: FAIL \nc = %1.100f \nd = %1.100f", c, d);
  end

endmodule

Here is the output:

Case1: FAIL
a = 1.7999999999999998223643160599749535322189331054687500000000000000000000000000000000000000000000000000
b = 1.8000000000000000444089209850062616169452667236328125000000000000000000000000000000000000000000000000

Case2: PASS
c = 1.7999999523162841796875000000000000000000000000000000000000000000000000000000000000000000000000000000
d = 1.7999999523162841796875000000000000000000000000000000000000000000000000000000000000000000000000000000

It is clearly seen that the 64-bit real variable has more precision than the 32-bit shortreal variable. In Case 1 the measured difference a and the constant b differ only in the last bit of the double (by about 2.2e-16), which is enough to make the == comparison fail. In Case 2 both values are rounded to the much coarser shortreal precision, the tiny difference is lost, and the comparison passes.

One more thing to consider here is that we are taking a difference of times. Floating point math is not exact. Simple values like 0.1 cannot be precisely represented using binary floating point numbers, and the limited precision of floating point numbers means that slight changes in the order of operations or the precision of intermediates can change the result. That means that comparing two floats to see if they are equal is usually not what you want.

This loss of precision does not matter much when you are doing calculations in nanoseconds, so we suggest using shortreal instead of real for such checks.
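
An alternative to switching data types, in line with the point above that exact equality is usually not what you want, is to keep real and compare with a tolerance. The following is only a sketch: the function name approx_equal and the tolerance TOL are our own assumptions, with TOL chosen here as 1 fs (the time precision) expressed in the 1 ns time unit.

```systemverilog
`timescale 1ns/1fs
module test_tolerance;
  // Assumed tolerance: 1 fs expressed in ns (the timescale unit).
  localparam real TOL = 1.0e-6;

  real a, b;
  realtime t1, t2;

  // Returns 1 when x and y differ by no more than tol.
  function automatic bit approx_equal(real x, real y, real tol);
    real diff;
    diff = x - y;
    if (diff < 0.0)
      diff = -diff;          // absolute value of the difference
    return (diff <= tol);
  endfunction

  initial begin
    #1ns;
    t1 = $realtime;
    #1.8ns;
    t2 = $realtime;
    b = 1.8;
    a = t2 - t1;

    // Tolerance-based check instead of a == b
    if (approx_equal(a, b, TOL))
      $display("PASS a = %f b = %f", a, b);
    else
      $display("FAIL a = %f b = %f", a, b);
  end
endmodule
```

With the values printed earlier, a and b differ by about 2.2e-16, far below the tolerance, so this check should take the PASS branch while still using 64-bit variables. The right tolerance depends on your timing check; tying it to the simulation time precision is just one reasonable choice.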

Sunday, 15 January 2017

2017 Symposia on VLSI Technology and Circuits

For the past 30 years, the combined annual Symposia on VLSI Technology and Circuits have provided an opportunity for the world’s top device technologists, circuit and system designers to engage in an open exchange of leading edge ideas at the world’s premier mid-year conference for microelectronics technology. Held together since 1987, the Symposia on VLSI Technology and Circuits have alternated each year between sites in the US and Japan, enabling attendees to learn about new directions in the development of VLSI technology & circuit design through the industry’s leading research and development presentations.

The comprehensive technical programs at the two Symposia are augmented with short courses, invited speakers and several evening panel sessions. Since 2012, the Symposia have presented joint focus sessions that include invited and contributed papers on topics of mutual interest to both technology and circuit attendees. A single registration enables participants to attend both Symposia.

Online paper submission:

Online paper submissions are now open for the 2017 Symposia on VLSI Technology and Circuits, to be held at the Rihga Royal Hotel in Kyoto, Japan from June 5 – 8, 2017. In a departure from previous years, both Symposia (VLSI Technology and VLSI Circuits) will be held on a fully overlapping schedule from June 6 – 8, preceded by Short Courses on June 5.

The deadline for paper submissions to both Symposia is January 23, 2017. Complete details for paper submission can be found online at: http://vlsisymposium.org/authors.html

This year’s Symposia theme is “Harmonious Integration Toward Next Dimensions.” Authors are encouraged to submit papers that showcase innovations that extend beyond single ICs and into the module level, with co-optimization of device technology and circuit/system design, including focus areas in the Internet of Things (IoT), industrial electronics, ‘big data’ management, artificial intelligence (AI), biomedical applications, virtual reality (VR) / augmented reality (AR), robotics and smart cars. These topics will be featured in focus sessions as part of the program.

The Symposium on VLSI Technology seeks technical innovation and advances in all aspects of IC technology, as well as the emerging IoT (Internet of Things) field, including:

IoT systems & technologies, including ultra-low power, heterogeneous integration, wearable devices, sensors, connectivity, power management, digital/analog, microcontrollers and application processors
Stand-alone & embedded memories, including technology & reliability for DRAM, SRAM, (3D-)NAND, MRAM, PCRAM, ReRAM and emerging memory technologies
CMOS Technology, microprocessors & SoCs, including scaling, VLSI manufacturing concepts and yield optimization
RF / analog / digital technologies for mixed-signal SoC, RF front end; analog, mixed-signal I/O, high voltage, imaging, MEMS, integrated sensors
Process & material technologies, including advanced transistor process and architecture, modeling and reliability; alternate channel; advanced lithography, high-density patterning; SOI and III-V technologies, photonics, local interconnects and Cu/optical interconnect scaling
Packaging technologies & System-in-Package (SiP), including through-silicon vias (TSVs), power & thermal management, inter-chip communication, 3D-system integration, as well as yield & test issues
Photonics Technology & ‘Beyond CMOS’ devices

The Symposium on VLSI Circuits seeks original papers showcasing technical innovations and advances in the following areas:

Digital circuits, processors and architectures, including circuits and techniques for standalone and embedded processors
Memory circuits, architectures & interfaces for volatile and non-volatile memories, including emerging memory technologies
Frequency generation and clock circuits for high-speed digital and mixed-signal applications
Analog and mixed-signal circuits, including amplifiers, filters and data converters
Wireline receivers & transmitters, including circuits for inter-chip and long-reach applications
Wireless receivers & transmitters, including circuits for WAN, LAN, PAN, BAN, inter-chip and mm-wave applications
Power conversion circuits, including battery management, voltage regulation, and energy harvesting
Imagers, displays, sensors, VLSI circuits & systems for biomedical, healthcare and wearable applications

Joint Technology & Circuits focus sessions feature invited and contributed papers highlighting innovations and advances in the following areas of joint interest:

IoT /ULP (Internet of Things / Ultra Low Power) devices: Advanced CMOS processes for ULP, design enablement, design for manufacturing, process/design co-optimization, on-die monitoring of variability and reliability
New Computing: Artificial intelligence, ‘beyond von Neumann’ computing, machine learning, neuromorphic & in-memory / in-sensor computing
2D MOSFETs / New concepts for channel & gate materials: Graphene, MoS2, α-Si / poly-Si or flexible organic materials for ‘More than Moore’ devices
Emerging memory technology & design: SRAM, DRAM, Flash, PCRAM, RRAM, and MRAM, Memristor, 3D Xpoint memory technologies
Design in scaled technologies: scaling of digital, memory, analog and mixed-signal circuits in advanced CMOS processes
3D & heterogeneous integration: power and thermal management; inter-chip communications, SIP architectures and applications

Best Student Paper Award

Awards for the best student paper at each Symposium are chosen based on the quality of the papers and presentations. The recipients will receive a monetary award, travel cost support and a certificate at the opening session of the 2018 Symposium. For a paper to be reviewed for this award, the author must be enrolled as a full-time student at the time of submission, must be the lead author and presenter of the paper, and must indicate on the web submission form that the paper is a student paper.

Sponsoring Organizations

The Symposium on VLSI Technology is sponsored by the IEEE Electron Devices Society and the Japan Society of Applied Physics, in cooperation with the IEEE Solid State Circuits Society.

The Symposium on VLSI Circuits is sponsored by the IEEE Solid State Circuits Society and the Japan Society of Applied Physics, in cooperation with the Institute of Electronics, Information and Communication Engineers and the IEEE Electron Devices Society.

Further Information, Registration and Official Call for Papers

Visit: http://www.vlsisymposium.org.

Monday, 21 November 2016

Advantages of Python over Perl

In the new competitive generation of chip design, time-to-market is critical and the complexity of designs is increasing exponentially. On top of that, verification is widely considered the longest pole and takes nearly 70% of the chip design life cycle. Hence any opportunity to automate a task that is repeated more than once is of the utmost importance for improving verification productivity. This is where “scripting” skills are highly valuable for any verification engineer.

After many years of writing design and verification automation scripts in Perl and Python, we would like to throw some light on the advantages of using Python.

Maintainability

As we all know, Perl is easy to write but hard to read, especially when someone else has written it. There are multiple ways of writing the same code. Add to this fact that many engineers take pride in writing highly obfuscated Perl that is a pain for others to read.

Maintainability is a critical aspect of any engineering project. Throwing away code and rewriting it is a productivity loss. Unfortunately, this happens a lot with Perl.

Python, on the other hand, has a clean syntax and typically there is only one way of doing what you want. Python code is hence much more readable. Even people who have never written Python code can understand it, as the syntax is very “pseudo-code” like. It is also easier to functionalize and modularize code in Python as the language naturally encourages this.

Re-usability

Perl is designed for use and throw. You write something in Perl, run it and then forget about it. It is very difficult to extend the functionality of a Perl script. Typically you would not have organized your code into functions, as Perl syntax does not encourage that. When you try adding some functionality to your Perl script you realize that re-writing it completely is better than re-using the earlier script and extending it.

Python syntax encourages re-usability. The mindset is different. When you write code in Python, you write with future re-usability in mind. This is really tough to do in Perl. Perl encourages shortcuts.

Scale

Writing large pieces of code (more than 50k lines) in Perl exposes the weaknesses in the language. Maintainability, performance, and packaging are big issues. Can I package my application in a way that doesn’t require users to download and install modules used by the application?

Perl encourages users to download and install modules as needed. IT departments are not comfortable with upgrading Perl installations on thousands of server farm nodes. It would be an IT nightmare.

Python distributions, on the other hand, come with a majority of the module libraries included. Also, Python allows the packaging of applications so users do not have to manually download and install all module and library dependencies needed to run an application.

Final words

Perl is great at some things. For example, it has fantastic regular expression capabilities (it can even combine multiple regexp’s and match all of them together!). Perl is a worthy successor to awk.

Bottom line: For use and throw scripts, Perl is great. But, if your code needs to be checked into a version control system and will potentially be modified by other people, I would prefer Python over Perl.

Saturday, 19 November 2016

Transaction Recording In Verilog Or System Verilog

As there is not yet a standard for transaction recording in Verilog or VHDL, ModelSim includes a set of system tasks to perform transaction recording into a WLF file. Transaction modeling allows users to raise the level of description, analysis and debugging of their designs to the transaction level. A transaction represents a transfer of high-level data or control information between the test bench and the design under test (DUT) over an interface or any sequence of signal transitions recorded in the simulation database as a transaction.

The API is the same for Verilog and SystemVerilog. In this post, the name "Verilog" refers to both Verilog and SystemVerilog unless otherwise noted.

The recording APIs for Verilog and VHDL are a bit simpler than the SCV API. Specifically, in Verilog and VHDL:

There is no database object as there is in SCV; the database is always WLF format (a .wlf file).

There is no concept of begin and end attributes. All attributes are recorded with the system task $add_attribute() or add_attribute.

Your design code must free the transaction handle once the transaction is complete and all use of the handle for relations or attribute recording is complete. (In most cases, SystemC designs ignore this step since SCV frees the handle automatically.)

A transaction has a begin time, an end time, and attributes. Examples of transactions include read operations, write operations, and packet transmissions. The transaction level is the level at which many users think about the design, so it is the level at which you can verify the design most effectively.

Transactions are recorded on a stream. A stream is a collection of transactions, recorded over time. A stream has a name, and usually exists somewhere within the test bench hierarchy, for example a driver might have a stream which represents all the transactions that have occurred on that driver. Each driver defines a collection of attributes (transaction items) which are defined by users, and which are meaningful to the transaction. The values of attributes are set for each transaction. Finally, transactions can be linked to each other. A link has a direction and a user-defined name, and specifies a relation between the two transactions.

module top;
  integer stream, tr;

  initial begin
    stream = $create_transaction_stream("Stream");
    #10;
    tr = $begin_transaction(stream, "Tran1");
    $add_attribute(tr, 10, "beg");
    $add_attribute(tr, 12, "special");
    $add_attribute(tr, 14, "end");
    #4;
    $end_transaction(tr);
    $free_transaction(tr);
  end
endmodule

Here,

1. $create_transaction_stream() is used to define a transaction stream. You can use this system task to create one or more stream objects.

module top;
integer hStream;

initial begin
hStream = $create_transaction_stream("stream", "transaction");
.
.
end
.
.
endmodule

2. $begin_transaction is used to start a transaction by providing a valid handle to the transaction stream, as shown below.

integer hTrans;
.
.
hTrans = $begin_transaction(hStream, "READ");

In this example, we begin a transaction named "READ" on the stream already created. The $begin_transaction system function accepts other parameters to specify: the start time for the transaction, and any relationship information, including its designation as a phase transaction.

The return value is the handle for the transaction. It is needed to end the transaction or to record attributes.

3. $end_transaction has a single required argument – the handle of the transaction that is to be ended. It also has a single optional argument, the time in the past that this transaction ended. After a transaction has been ended, the transaction handle can still be used to add attributes and create relations.

$end_transaction( handle transaction [, time endTime])

4. $free_transaction has a single argument – the handle of the transaction to be deleted. Once a transaction is deleted the handle becomes invalid. It cannot be used in any other recording interfaces.

$free_transaction (handle transaction)

5. $add_attribute has two required arguments – a transaction handle on which the attribute is to be created and the attribute that is to be recorded. There is one optional argument of type string named attributeName. This attributeName specifies an alias name for the attribute. If not specified, the name used for the attribute is the actual name of the SystemVerilog object.

$add_attribute( handle transaction, object attributeValue [, string attributeName])

6. $add_relation has three arguments – the first two are the two transaction handles which are related. The third argument is the string name of the relation.

$add_relation( handle sourceTransaction, handle targetTransaction, string relationshipName)
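
The first example above does not use $add_relation, so here is a small sketch that links two transactions on one stream. It uses only the system tasks described in this post and therefore needs a simulator that provides them (ModelSim). The transaction names, the addr variable and the relation name "response" are hypothetical.

```systemverilog
module top_relation;
  integer stream, tr_req, tr_rsp;
  integer addr = 16;   // hypothetical value recorded as an attribute

  initial begin
    stream = $create_transaction_stream("Stream");

    #10;
    tr_req = $begin_transaction(stream, "REQ");
    $add_attribute(tr_req, addr, "addr");
    #4;
    $end_transaction(tr_req);

    tr_rsp = $begin_transaction(stream, "RSP");
    #4;
    $end_transaction(tr_rsp);

    // An ended transaction handle can still be used to create relations.
    // Source is the request, target is the response.
    $add_relation(tr_req, tr_rsp, "response");

    // Free the handles only after all attribute and relation
    // recording that uses them is complete.
    $free_transaction(tr_req);
    $free_transaction(tr_rsp);
  end
endmodule
```

The ordering matters: both handles are freed last, because a freed handle is invalid and cannot be passed to $add_relation.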

Saturday, 22 October 2016

Build smart tests using uvm_report_catcher

Today we will look at a very useful UVM concept, especially when we are doing erroneous (error injection) testing. It is all about modifying the severity, id, action, verbosity or the report string itself before the report is finally issued by the report server.

Normally our test environment reports an error when an erroneous scenario occurs, such as a CRC error condition or the generation of an error interrupt. We also write erroneous tests to exercise these scenarios, so we end up with error messages and the tests fail.

The uvm_report_catcher is used to catch messages issued by the uvm report server. Upon catching a report, the catch method can modify the severity, id, action, verbosity or the report string itself before the report is finally issued by the report server. The report can be immediately issued from within the catcher class by calling the issue method.

The catcher maintains a count of all reports with FATAL, ERROR or WARNING severity and a count of all reports with FATAL, ERROR or WARNING severity whose severity was lowered. These statistics are reported in the summary of the uvm_report_server.

Example:

Report catcher class:

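class error_report_catcher extends uvm_report_catcher;

  function new(string name = "error_report_catcher");
    super.new(name);
  endfunction

  // Called for every report before the report server issues it
  virtual function action_e catch();
    if (get_severity() == UVM_ERROR && get_id() == "MON_CHK_NOT_VALID") begin
      set_severity(UVM_INFO);
      return CAUGHT; // expected error: stop processing, do not issue it
    end
    else begin
      return THROW;  // anything else goes on to the report server
    end
  endfunction

endclass : error_report_catcher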

Use of error catcher in testcase:

class erroneous_test extends base_test_class;

  // report catcher to suppress errors
  error_report_catcher error_catcher;

  /// \fn new_constructor

  /// \fn build_phase
  virtual function void build_phase(uvm_phase phase);
    super.build_phase(phase);

    error_catcher = new();
    uvm_report_cb::add(null, error_catcher);

    uvm_config_db#(int)::set(this, "uvc.tx_agent", "is_active", UVM_ACTIVE);

    // User configurations
    env_cfg.print();
    uvm_config_db#(env_config_c)::set(this, "*", "env_cfg", env_cfg);

    // Calling the error sequence
    uvm_config_db#(uvm_object_wrapper)::set(this, "uvc.tx_agent.tx_sequencer.main_phase", "default_sequence", valid_invalid_seq_c::type_id::get());
  endfunction : build_phase

endclass : erroneous_test
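
The catcher above returns CAUGHT, which stops the expected error from being issued at all. If you would rather keep the message in the log at a lower severity, a variant can change the severity and return THROW instead. This is a sketch under that assumption. The class name error_demoter is hypothetical, and the id is the one used earlier in this post.

```systemverilog
import uvm_pkg::*;

// Hypothetical variant of the catcher: demote instead of suppress.
class error_demoter extends uvm_report_catcher;

  function new(string name = "error_demoter");
    super.new(name);
  endfunction

  virtual function action_e catch();
    // Lower the severity of the expected error only
    if (get_severity() == UVM_ERROR && get_id() == "MON_CHK_NOT_VALID")
      set_severity(UVM_INFO);

    // THROW passes the (possibly modified) report on to any other
    // catchers and finally to the report server, so it is still printed.
    return THROW;
  endfunction

endclass : error_demoter
```

It is registered in the test the same way as before, with uvm_report_cb::add(null, demoter) on an instance created in build_phase.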

Sunday, 11 September 2016

64 core processor from Chinese chip maker Phytium

While the world awaits the AMD K12 and Qualcomm Hydra ARM server chips to join the ranks of the Applied Micro X-Gene and Cavium ThunderX processors already in the market, it could be upstart Chinese chip maker Phytium Technology that gets a brawny chip into the field first and also gets traction among actual datacenter server customers, not just tire kickers.

Phytium Technology has announced a 64-core ARM server CPU, which according to the press release will deliver 512 gigaflops of performance. The new chip, known as FT-2000/64, is aimed at “high throughput and high performance servers.”

Phytium is a chip design enterprise, based in Tianjin, China. In March 2015, the company released its first products: the FT-1500A/4 and FT-1500A/16, 4-core and 16-core implementations, respectively of the ARMv8 design.

Phytium was on hand at last week’s Hot Chips 28 conference, showing off its chippery and laptop, desktop and server machines employing its “Earth” and “Mars” FT series of ARM chips. Most of the interest that people showed was in the server variants, which are both based on variants of the “Xiaomi” core design that the company has cooked up based on ARMv8 intellectual property licensed from ARM Holdings. There is chatter that one of the three Chinese exascale machines, which we wrote about here, will employ a future Phytium processor, but we were unable to confirm this with the Phytium executives at the event. What we can tell you is that the first engineering samples of the two Earth ARM chips, the FT-1500A/4 and the FT-1500A/16, as well as the one Mars ARM chip, the FT-2000/64, are back from Taiwan Semiconductor Manufacturing Corp and that we saw systems running the Kylin Linux operating system (a variant of Canonical’s Ubuntu) at the Hot Chips event.

Here are the key chip features from the FT-2000/64 product page:

Process: Manufacturing with 28nm process
Core: Integrating sixty-four FTC661 cores
Frequency: Running at 1.5GHz~2.0GHz
Cache: Integrating 32MB L2 cache and extending 128MB LLC
Extension Interface: Integrating eight proprietary extension interfaces, each delivering 19.2GB/s effective r/w bandwidth
Memory Interface: Extending sixteen DDR3-1600 memory controllers, which can deliver 204.8GB/s memory access bandwidth.
I/O Interface: Integrating two x16 or four x8 PCIE Gen3 interface
Power: Max. power 100W
Package: FCBGA package with 2892 pins

No pricing was provided on the new chips, and it’s unclear from the press release if the product is available today. The next time we hear about the FT-2000/64 might very well be when it shows up in a TOP500 supercomputer. Stay tuned.