Saturday, 8 September 2012

Built-in Primitives

 

Formal Definition

The built-in primitives provide a means of gate and switch modeling.

Simplified Syntax

For and, nand, or, nor, xor, xnor, buf, not

  gate (drive_strength) #(2delays) instance_name[range] (list_of_ports);

For bufif0, bufif1, notif0, notif1

  gate (drive_strength) #(3delays) instance_name[range] (list_of_ports);

For nmos, pmos, rnmos, rpmos, cmos, rcmos, rtranif0, rtranif1, tranif0, tranif1

  gate #(3delays) instance_name[range] (list_of_ports);

For tran, rtran

  gate instance_name[range] (list_of_ports);

For pullup, pulldown

  pullup (pullup_strength) instance_name[range] (list_of_ports);

  pulldown (pulldown_strength) instance_name[range] (list_of_ports);

Description

Gate or switch declaration begins with a keyword that determines the type of a gate or switch followed by a strength and delay declaration, name of the instance, range, and a list of connections to the gate or switch ports. The strength and the delay declarations are optional. The name of an instance and a range are also optional. Instantiations of individual gate types are not identical.

and, nand, or, nor, xor, xnor gates

The instantiation of these logic gates (Example 1) can contain zero, one, or two delays. The strength declaration should contain two specified strengths - strength1 and strength0 (see Strengths for more explanations).

These gates have one output and one or more inputs. The first port on the port list is the output port.

  and  | 0 1 x z
  -----+--------
  0    | 0 0 0 0
  1    | 0 1 x x
  x    | 0 x x x
  z    | 0 x x x

  or   | 0 1 x z
  -----+--------
  0    | 0 1 x x
  1    | 1 1 1 1
  x    | x 1 x x
  z    | x 1 x x

  xor  | 0 1 x z
  -----+--------
  0    | 0 1 x x
  1    | 1 0 x x
  x    | x x x x
  z    | x x x x

  nand | 0 1 x z
  -----+--------
  0    | 1 1 1 1
  1    | 1 0 x x
  x    | 1 x x x
  z    | 1 x x x

  nor  | 0 1 x z
  -----+--------
  0    | 1 0 x x
  1    | 0 0 0 0
  x    | x 0 x x
  z    | x 0 x x

  xnor | 0 1 x z
  -----+--------
  0    | 1 0 x x
  1    | 0 1 x x
  x    | x x x x
  z    | x x x x

Table 1 Truth tables for logic gates
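
A few rows of Table 1 can be spot-checked in simulation. The following is a minimal sketch, not part of the original reference: the testbench name tb_logic_gates and all signal names are made up for illustration. The expected values in the comments are read directly from Table 1.

```verilog
// Hypothetical testbench; module and signal names are illustrative only.
module tb_logic_gates;
  reg  a, b;
  wire y_and, y_or, y_xor;

  // The first port is the output; the remaining ports are inputs.
  and g_and (y_and, a, b);
  or  g_or  (y_or,  a, b);
  xor g_xor (y_xor, a, b);

  initial begin
    // 0 and x: Table 1 gives and=0, or=x, xor=x
    a = 1'b0; b = 1'bx;
    #1 $display("a=%b b=%b and=%b or=%b xor=%b", a, b, y_and, y_or, y_xor);

    // 1 and x: Table 1 gives and=x, or=1, xor=x
    a = 1'b1; b = 1'bx;
    #1 $display("a=%b b=%b and=%b or=%b xor=%b", a, b, y_and, y_or, y_xor);

    // 1 and z: a z input is treated like x, so and=x, or=1, xor=x
    a = 1'b1; b = 1'bz;
    #1 $display("a=%b b=%b and=%b or=%b xor=%b", a, b, y_and, y_or, y_xor);
  end
endmodule
```

The #1 delay before each $display only gives the gate outputs time to settle after the inputs change.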

buf and not gates

The instantiation of these logic gates (Example 2) can contain zero, one, or two delays. The strength declaration should contain two specified strengths - strength1 and strength0 (see Strengths for more explanations).

These gates have one input and one or more outputs. The last port on the port list is the input port.

  buf
  input | output
  ------+-------
  0     | 0
  1     | 1
  x     | x
  z     | x

  not
  input | output
  ------+-------
  0     | 1
  1     | 0
  x     | x
  z     | x

Table 2 Truth tables for logic gates

bufif1, bufif0, notif1, notif0 gates

The instantiation of these tri-state gates (Example 3) can contain zero, one, two, or three delays. The strength declaration should contain two specified strengths - strength1 and strength0 (see Strengths for more explanations).

These gates have three ports: the first is an output port, the second is a data port, and the third is a control port. The control port is used to put the gate output into the high-impedance state.

  bufif0     | control input
  data input | 0 1 x z
  -----------+--------
  0          | 0 z L L
  1          | 1 z H H
  x          | x z x x
  z          | x z x x

  bufif1     | control input
  data input | 0 1 x z
  -----------+--------
  0          | z 0 L L
  1          | z 1 H H
  x          | z x x x
  z          | z x x x

Table 3 Truth table for tri-state logic gates

The L and H symbols have a special meaning. The L symbol means that the output has 0 or z value. The H symbol means that the output has 1 or z value. Any transition to H or L is treated as a transition to x.
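
The control-port behaviour in Table 3 is what lets several tri-state gates share one net. Below is a minimal sketch under assumed names (tb_tristate, drv0, drv1, bus are all hypothetical): two bufif1 gates drive the same wire and only one is enabled at a time.

```verilog
// Hypothetical testbench; module and signal names are illustrative only.
module tb_tristate;
  reg  d0, d1, en0, en1;
  wire bus;

  // Port order: output, data, control. bufif1 drives when control is 1.
  bufif1 drv0 (bus, d0, en0);
  bufif1 drv1 (bus, d1, en1);

  initial begin
    d0 = 1'b0; d1 = 1'b1;

    en0 = 1'b1; en1 = 1'b0;            // only drv0 drives: bus = 0
    #1 $display("en0=%b en1=%b bus=%b", en0, en1, bus);

    en0 = 1'b0; en1 = 1'b1;            // only drv1 drives: bus = 1
    #1 $display("en0=%b en1=%b bus=%b", en0, en1, bus);

    en0 = 1'b0; en1 = 1'b0;            // nobody drives: bus = z
    #1 $display("en0=%b en1=%b bus=%b", en0, en1, bus);
  end
endmodule
```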

nmos, pmos, rnmos, rpmos, cmos, and rcmos switches

The nmos switch models an N-type MOS (Metal-Oxide Semiconductor) transistor and the pmos switch models a P-type MOS transistor. The rnmos switch models a resistive nmos transistor and the rpmos switch models a resistive pmos transistor. The cmos switch should be treated as a combination of a pmos switch and an nmos switch that share a common data input and data output. The rcmos switch should be treated as a combination of an rpmos switch and an rnmos switch that share a common data input and data output.

The instantiation of these MOS switches (Example 4) can contain zero, one, two, or three delays.

A strength declaration is illegal. The nmos, pmos and cmos switches reduce the supply strength of signals to strong. Signals with other strengths are passed from input to output without strength reduction. The rnmos, rpmos and rcmos switches reduce supply and strong strengths to pull. The pull strength is reduced to weak. The large and weak strengths are reduced to medium. The medium strength is reduced to small. Signals with other strengths are passed from input to output without strength reduction.

The nmos, pmos, rnmos, rpmos switches have three ports: the first is an output port, the second is a data port, and the third is a control port.

The cmos and rcmos switches have four ports: the first is an output port, the second is a data port, the third is an n-control port, and the fourth is a p-control port.

  pmos, rpmos | control input
  data input  | 0 1 x z
  ------------+--------
  0           | 0 z L L
  1           | 1 z H H
  x           | x z x x
  z           | z z z z

  nmos, rnmos | control input
  data input  | 0 1 x z
  ------------+--------
  0           | z 0 L L
  1           | z 1 H H
  x           | z x x x
  z           | z z z z

  cmos, rcmos
  N control | 0       | 1       | x       | z
  P control | 0 1 x z | 0 1 x z | 0 1 x z | 0 1 x z
  ----------+---------+---------+---------+--------
  data 0    | 0 z L L | 0 0 0 0 | 0 L L L | 0 L L L
  data 1    | 1 z H H | 1 1 1 1 | 1 H H H | 1 H H H
  data x    | x z x x | x x x x | x x x x | x x x x
  data z    | z z z z | z z z z | z z z z | z z z z

Table 4 Truth tables for MOS switches

Symbols L and H have a special meaning. The symbol L means that the output has 0 or z value. The symbol H means that the output has 1 or z value. Any transition to H or L is treated as a transition to x.

rtranif0, rtranif1, tranif0 and tranif1 switches

The instantiation of these bi-directional pass switches (Example 5) can contain zero, one, two, or three delays.

A strength declaration is illegal. The tranif0 and tranif1 switches reduce the supply strength of signals to strong. Signals with other strengths are passed from input to output without strength reduction. The rtranif0 and rtranif1 switches reduce supply and strong strengths to pull. The pull strength is reduced to weak. The large and weak strengths are reduced to medium. The medium strength is reduced to small. Signals with other strengths are passed from input to output without strength reduction.

The rtranif0, rtranif1, tranif0 and tranif1 switches have three ports: two bidirectional data ports and one control port (third position on port list).
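
As an illustration of the port order, here is a minimal sketch of a tranif1 switch. It is not one of the original examples: the names tb_tranif, sw, drive_a, a and b are hypothetical, and only the 'a' side is driven so that the effect of the control port is easy to see.

```verilog
// Hypothetical testbench; module and signal names are illustrative only.
module tb_tranif;
  reg  drive_a, ctrl;
  wire a, b;

  assign a = drive_a;        // drive only the 'a' side of the switch

  // Two bidirectional data ports, then the control port.
  // tranif1 conducts when the control port is 1.
  tranif1 sw (a, b, ctrl);

  initial begin
    drive_a = 1'b1;

    ctrl = 1'b1;             // switch on: b follows a, so b = 1
    #1 $display("ctrl=%b a=%b b=%b", ctrl, a, b);

    ctrl = 1'b0;             // switch off: b is undriven, so b = z
    #1 $display("ctrl=%b a=%b b=%b", ctrl, a, b);
  end
endmodule
```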

tran and rtran switches

An instance of these bidirectional switches cannot contain a delay or strength declaration.

The tran switches reduce the supply strength of signals to strong. Signals with other strengths are passed from input to output without strength reduction. The rtran switches reduce supply and strong strengths to pull. The pull strength is reduced to weak. The large and weak strengths are reduced to medium. The medium strength is reduced to small. Signals with other strengths are passed from input to output without strength reduction.

The tran and rtran switches have two bidirectional data ports.

pullup and pulldown sources

The instantiation of pullup and pulldown sources cannot contain a delay declaration. A pullup uses only the strength1 specification (a strength0 specification is optional and has no effect). A pulldown uses only the strength0 specification (a strength1 specification is optional and has no effect).

The pullup source places a logic value 1 on connected signals. The pulldown source places a logic value 0 on connected signals.
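
A pullup is typically paired with a switch that can override it. The sketch below assumes hypothetical names (tb_pullup, sig, gnd, pull_low): with no strength specification the pullup drives a pull-strength 1, and an nmos switch passing supply0 (reduced to strong0) wins when it is turned on.

```verilog
// Hypothetical testbench; module and signal names are illustrative only.
module tb_pullup;
  reg     pull_low;
  wire    sig;
  supply0 gnd;                     // net tied to logic 0 at supply strength

  pullup (sig);                    // no strength given: pull strength 1
  nmos   (sig, gnd, pull_low);     // ports: output, data, control

  initial begin
    pull_low = 1'b0;               // nmos off: only the pullup drives, sig = 1
    #1 $display("pull_low=%b sig=%b", pull_low, sig);

    pull_low = 1'b1;               // nmos on: strong 0 overrides pull 1, sig = 0
    #1 $display("pull_low=%b sig=%b", pull_low, sig);
  end
endmodule
```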

Examples

Example 1

and (strong1, weak0) (y, i1, i2, i3);

This is a three-input and gate instance with strengths specified. There is no instance name and no delays declaration.

nand #(1,2) gate1 (y, i1, i2);

This is a two-input nand gate instance with two delays specified. The instance name is gate1 and there is no strength specified.

or #1 b[1:0] (y, i1, i2, i3, i4);

This is an array of two four-input or gate instances with one delay specified. The instances are named b[1] and b[0]. There are no strengths specified.

Example 2

buf (o1, o2, o3, o4, i);

This is the instance of buf gate, which has four outputs and one input.

Example 3

bufif0 (weak1, pull0) #(4,5,3) (data_out, data_in, ctrl);

This is a bufif0 gate instance with strengths and three delays specified. There is no instance name, because the name is optional.

Example 4

pmos (data_out, data_in, ctrl);
cmos (data_out, data_in, n_ctrl, p_ctrl);

The pmos and cmos switches are instantiated without delays or instance names. A strength declaration is not allowed for these switches.

Important Notes

· Instantiations of gates and switches are different for individual types.

Block Statements

Formal Definition

The block statements provide a means of grouping two or more statements in the block.

Simplified Syntax

begin : name

  statement;

  ...

end

fork : name

  statement;

  ...

join

Description

The Verilog HDL contains two types of blocks:

· Sequential (begin-end blocks).

· Parallel (fork-join blocks).

These blocks can be used if more than one statement should be executed.

All statements within sequential blocks (Example 1) are executed in the order in which they are given. If a timing control statement appears within a block, then the next statement will be executed after that delay.

All statements within parallel blocks (Example 2) are executed at the same time. This means that the execution of the next statement will not be delayed even if the previous statement contains a timing control statement.

Examples

Example 1

begin
  a = 1;
  #10 a = 0;
  #5 a = 4;
end

During the simulation, this block will be executed in 15 time units. At time 0, the 'a' variable will be 1, at time 10 the 'a' variable will be 0, and at time 15 (#10 + #5) the 'a' variable will be 4.

Example 2

fork
  a = 1;
  #10 a = 0;
  #5 a = 4;
join

During the simulation this block will be executed in 10 time units. At time 0 the 'a' variable will be 1, at time 5 the 'a' variable will be 4, and at time 10 the 'a' variable will be 0.

Example 3

fork
  a = 1;
  @(b) a = 0;
join

When this block is executed, 'a' becomes 1 immediately, and when a change occurs on 'b', 'a' changes to 0. The second statement is suspended at the @(b) event control until the value of 'b' changes, and the fork-join block completes only after both statements have finished. Note that writing @(b); and a = 0; as two separate statements would start them in parallel, so the assignment 'a = 0' would not wait for 'b'.
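
Examples 1 and 2 can be compared side by side in one simulation. This is a minimal sketch with hypothetical names (tb_blocks, seq_blk, par_blk): 'a' is assigned in a sequential block and 'b' in a parallel block, using the same three statements.

```verilog
// Hypothetical testbench; module, block and signal names are illustrative only.
module tb_blocks;
  reg [2:0] a, b;

  // Sequential block: delays accumulate (0, 10, 15).
  initial begin : seq_blk
    a = 1;
    #10 a = 0;
    #5  a = 4;
  end

  // Parallel block: every delay is measured from the start of the block (0, 10, 5).
  initial fork : par_blk
    b = 1;
    #10 b = 0;
    #5  b = 4;
  join

  // Expected trace:
  //   t=0  a=1 b=1
  //   t=5  a=1 b=4
  //   t=10 a=0 b=0
  //   t=15 a=4 b=0
  initial $monitor("t=%0t a=%0d b=%0d", $time, a, b);
endmodule
```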

Important Notes

· In parallel blocks all statements are executed at the same time, so there must not be any mutually dependent assignments.

Bit-select

Formal Definition

The bit-select provides an access to individual bits of vectors.

Simplified Syntax

vector_identifier[expression];

Description

The bit-select can be used to access individual bits of vector net or register data types. The bits can be addressed by using an expression. If the expression value is out of bounds or it returns z or x values, then the value returned by the reference is x. If one or more bits of the address returned by the expression have an x or z value, then the address expression is x.

The bit-select can be applied to any net vectors, regs, integers, and time register data types. The bit-selection of a register declared as real or realtime is illegal.

Examples

Example 1

reg [3:0] vect;
vect = 4'b0001;

If the value of address expression is 0 then returned value is 1 (vect[0] = 1).

If the value of address expression is 3 then returned value is 0 (vect[3] = 0).

If the value of address expression is 4 then returned value is x (vect[4] = x).

If the value of address expression is x or z then returned value is x (vect[1'bx] = x).

Example 2

reg [0:3] vect;
vect = 4'b0001;

If the value of address expression is 3 then returned value is 1 (vect[3] = 1).

If the value of address expression is 0 then returned value is 0 (vect[0] = 0).

Example 3

reg [7:0] vect;
vect = 4;

Fills vect with the pattern 00000100 (MSB is bit 7, LSB is bit 0), so vect[2] returns 1 and every other bit-select of vect returns 0.

Important Notes

  • If the address expression is out of bounds or it returns an x or z value, then the returned value is x.
  • The bit-select of real or realtime registers is illegal.
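
The cases from Examples 1 and 2 can be reproduced in a small testbench. This is a minimal sketch with hypothetical names (tb_bit_select, rvect, idx). The out-of-bounds and unknown addresses are supplied through a variable index so that they are evaluated at run time.

```verilog
// Hypothetical testbench; module and signal names are illustrative only.
module tb_bit_select;
  reg [3:0] vect;    // bit 3 is the MSB, bit 0 is the LSB
  reg [0:3] rvect;   // bit 0 is the MSB, bit 3 is the LSB
  integer   idx;

  initial begin
    vect  = 4'b0001;
    rvect = 4'b0001;

    $display("vect[0]=%b vect[3]=%b",   vect[0],  vect[3]);   // 1 and 0
    $display("rvect[0]=%b rvect[3]=%b", rvect[0], rvect[3]);  // 0 and 1

    idx = 4;                                  // address out of bounds
    $display("vect[%0d]=%b", idx, vect[idx]); // returns x

    idx = 32'bx;                              // unknown address
    $display("vect[x]=%b", vect[idx]);        // returns x
  end
endmodule
```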

Thursday, 6 September 2012

Choosing FPGA or DSP for your Application

 

FPGA or DSP - The Two Solutions

The DSP is a specialised microprocessor - typically programmed in C, perhaps with assembly code for performance. It is well suited to extremely complex maths-intensive tasks, with conditional processing. It is limited in performance by the clock rate, and the number of useful operations it can do per clock. As an example, a TMS320C6201 has two multipliers and a 200MHz clock – so can achieve 400M multiplies per second.

In contrast, an FPGA is an uncommitted "sea of gates". The device is programmed by connecting the gates together to form multipliers, registers, adders and so forth. Using the Xilinx Core Generator this can be done at a block-diagram level. The blocks range from a single gate up to a complete FIR or FFT. FPGA performance is limited by the number of gates available and the clock rate. Recent FPGAs include dedicated multipliers for performing DSP tasks more efficiently. For example, a 1M-gate Virtex-II™ device has 40 multipliers that can operate at more than 100MHz. Compared with the DSP's 400M, this gives 4000M multiplies per second.
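
The two throughput figures follow from multiplying the number of multipliers by the clock rate. The calculation below simply restates the numbers quoted above, under the assumption that every multiplier completes one multiply per clock cycle:

```latex
\begin{align*}
\text{DSP (TMS320C6201):} \quad & 2 \times 200\,\text{MHz} = 400 \times 10^{6}\ \text{multiplies/s} \\
\text{FPGA (1M-gate Virtex-II):} \quad & 40 \times 100\,\text{MHz} = 4000 \times 10^{6}\ \text{multiplies/s} \\
\text{Ratio:} \quad & \frac{4000 \times 10^{6}}{400 \times 10^{6}} = 10
\end{align*}
```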

 

Where They Excel


When sample rates grow above a few MHz, a DSP has to work very hard to transfer the data without any loss. This is because the processor must use shared resources like memory buses, or even the processor core, which can be prevented from taking interrupts for some time. An FPGA, on the other hand, dedicates logic to receiving the data, so it can maintain high rates of I/O.

A DSP is optimised for use of external memory, so a large data set can be used in the processing. FPGAs have a limited amount of internal storage so need to operate on smaller data sets. However FPGA modules with external memory can be used to eliminate this restriction.

A DSP is designed to offer simple re-use of the processing units, for example a multiplier used for calculating an FIR can be re-used by another routine that calculates FFTs. This is much more difficult to achieve in an FPGA, but in general there will be more multipliers available in the FPGA. 

If a major context switch is required, the DSP can implement this by branching to a new part of the program. In contrast, an FPGA needs to build dedicated resources for each configuration. If the configurations are small, then several can exist in the FPGA at the same time. Larger configurations mean the FPGA needs to be reconfigured – a process which can take some time.

The DSP can take a standard C program and run it. This C code can have a high level of branching and decision making – for example, the protocol stacks of communications systems. This is difficult to implement within an FPGA.

Most signal processing systems start life as a block diagram of some sort. Actually translating the block diagram to the FPGA may well be simpler than converting it to C code for the DSP.

Making a Choice

There are a number of elements to the design of most signal processing systems, not least the expertise and background of the engineers working on the project. These all have an impact on the best choice of implementation. In addition, consider the resources available – in many cases, I/O modules have FPGAs on board. Using these with a DSP processor may provide an ideal split.

 

As a rough guideline, try answering these questions:

  1. What is the sampling rate of this part of the system? If it is more than a few MHz, FPGA is the natural choice.
  2. Is your system already coded in C? If so, a DSP may implement it directly. It may not be the highest performance solution, but it will be quick to develop.
  3. What is the data rate of the system? If it is more than perhaps 20-30Mbyte/second, then FPGA will handle it better.
  4. How many conditional operations are there? If there are none, FPGA is perfect. If there are many, a software implementation may be better.
  5. Does your system use floating point? If so, this is a factor in favour of the programmable DSP. None of the Xilinx cores support floating point today, although you can construct your own.
  6. Are libraries available for what you want to do? Both DSP & FPGA offer libraries for basic building blocks like FIRs or FFTs. However, more complex components may not be available, and this could sway your decision to one approach or the other.

In reality, most systems are made up of many blocks. Some of those blocks are best implemented in FPGA, others in DSP. Lower sampling rates and increased complexity suit the DSP approach; higher sampling rates, especially combined with rigid, repetitive tasks, suit the FPGA.

 

Some Examples

Here are a few examples of signal processing blocks, along with how we would implement them:

  1. First decimation filter in a digital wireless receiver. Typically, this is a CIC filter, operating at a sample rate of 50-100MHz. A 5-stage CIC has 10 registers & 10 adds, giving an "add rate" of 500-1000MHz.
    At these rates any DSP processor would find it extremely difficult to do anything. However, the CIC has an extremely simple structure, and implementing it in an FPGA would be easy. A sample rate of 100MHz should be achievable, and even the smallest FPGA will have a lot of resource left for further processing.
  2. Communications Protocol Stack – ISDN, IEEE1394 etc; these are complex large pieces of C code, completely unsuitable for the FPGA. However the DSP will implement them easily. Not only that, a single code base can be maintained, allowing the code stack to be implemented on a DSP in one product, or a separate control processor in another; and bringing the opportunity to licence the code stack from a specialist supplier.
  3. Digital radio receiver – baseband processing. Some receiver types would require FFTs for signal acquisition, then matched filters once a signal is acquired. Both blocks can be easily implemented by either approach. However, there is a mode change – from signal acquisition to signal reception.
    It may well be that this is better suited to the DSP, as the FPGA would need to implement both blocks simultaneously. Note that the RF processing is better in an FPGA, so this is likely to be a mixed system.
    (Note – with today’s larger FPGAs, both modes of this system could be included in the FPGA at the same time.)
  4. Image processing. Here, most of the operations on an image are simple and very repetitive – best implemented in an FPGA. However, an imaging pipeline is often used to identify "blobs" or "Regions of Interest" in an object being inspected. These blobs can be of varying sizes, and subsequent processing tends to be more complex. The algorithms used are often adaptive, depending on what the blob turns out to be… so a DSP-based approach may be better for the back end of the imaging pipeline.
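
To show how simple the CIC structure in the first example is, here is a rough Verilog sketch of a 5-stage decimating CIC with 10 registers and 10 adders. It is an illustration under assumptions, not a reference design: the module name cic_decim, the port names, the decimation ratio R = 16, the 12-bit input width and the differential delay of 1 are all chosen for the example.

```verilog
// Hypothetical CIC decimator sketch: N integrators, decimate by R, N combs.
// All names, widths and parameter values are assumptions for illustration.
module cic_decim #(
  parameter N  = 5,    // number of stages
  parameter R  = 16,   // decimation ratio (assumed)
  parameter IW = 12,   // input width (assumed)
  parameter OW = 32    // IW + N*log2(R) = 12 + 5*4 bits to hold the gain
) (
  input  wire                 clk,
  input  wire                 rst,
  input  wire signed [IW-1:0] din,
  output reg  signed [OW-1:0] dout,
  output reg                  dout_valid
);
  reg signed [OW-1:0] integ [0:N-1];  // integrator registers
  reg signed [OW-1:0] delay [0:N-1];  // comb delay registers
  reg signed [OW-1:0] acc;            // temporary value for the comb chain
  reg        [15:0]   count;          // input-sample counter for decimation
  integer i;

  always @(posedge clk) begin
    if (rst) begin
      for (i = 0; i < N; i = i + 1) begin
        integ[i] <= 0;
        delay[i] <= 0;
      end
      count      <= 0;
      dout       <= 0;
      dout_valid <= 1'b0;
    end else begin
      // Integrators run at the input sample rate (one add each per clock).
      integ[0] <= integ[0] + din;
      for (i = 1; i < N; i = i + 1)
        integ[i] <= integ[i] + integ[i-1];

      dout_valid <= 1'b0;
      if (count == R-1) begin
        count <= 0;
        // Combs run once per R inputs: y = x - (previous x), chained N times.
        acc = integ[N-1];
        for (i = 0; i < N; i = i + 1) begin
          delay[i] <= acc;            // remember this stage's input
          acc = acc - delay[i];       // subtract the previously stored input
        end
        dout       <= acc;
        dout_valid <= 1'b1;
      end else begin
        count <= count + 1;
      end
    end
  end
endmodule
```

The accumulators rely on wrap-around arithmetic, which is why the internal width OW has to cover the full gain of the filter. A real design would also need to consider output scaling and timing closure at the target clock rate.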

Summary

FPGA and DSP represent two very different approaches to signal processing – each good at different things. There are many high sampling rate applications that an FPGA does easily, while the DSP could not. Equally, there are many complex software problems that the FPGA cannot address.

Feature Detection on FPGA

 

Autonomous landing and roving on the Moon require fast computation. Space-tolerant processors are not powerful enough to run the computer vision algorithms needed to land safely on the surface of the Moon. Field Programmable Gate Arrays (FPGAs) can parallelize computations that a processor executes serially. An FPGA programmer can directly program the logic fabric of the device, whereas a processor is restricted to a set of assembly instructions which it fetches, decodes, and executes in order to run a program. The video above executes several blurs on an FPGA. Blurring is required for the Scale Invariant Feature Transform (SIFT) feature detection algorithm that is currently being developed.

Monday, 3 September 2012

Intel Chips Will Support Wireless Charging by 2014

Earlier this month it was rumored that Intel was developing a new wireless charging solution it could integrate into the Ultrabook platform. In so doing, it would remove the need to plug your Ultrabook into a power socket directly, instead placing it on a charging pad or at least near a power source.

Intel’s interest in wireless charging has today been confirmed through a new partnership with Integrated Device Technology (IDT). IDT will develop a new integrated transmitter and receiver for Intel, allowing for wireless charging using resonance technology from up to several feet away.

“Our extensive experience in developing the innovative and highly integrated IDTP9030 transmitter and multi-mode IDTP9020 receiver has given IDT a proven leadership position in the wireless power market,” said Arman Naghavi, vice president and general manager of the Analog and Power Division at IDT.

As for when we can expect to see an Intel-branded wireless charging system, IDT is working to provide samples in the first half of 2013, suggesting product integration should happen in time for the major holiday season at the end of next year.

IDT’s Gary Huang has also suggested that eventually wireless charging will expand to power everything on your desktop. So your wireless keyboard, mouse, backup storage device, smartphone, and PC/laptop will all be completely wireless, with each including the necessary components and battery to be charged.

The chipmaker is entering a market where there is already a proposed standard called Qi. Qi has received a wide array of support, including Energizer, Texas Instruments, Verizon, and phone manufacturers including Nokia, Research In Motion, LG, and HTC. 

Currently, 88 products are listed by the Wireless Power Consortium as being Qi-compatible, including phones from NTT DoCoMo and HTC.

Intel is not a part of that group, and its wireless charging effort, which is based on a platform created by IDT, is apparently not Qi-compatible. Since Qi is already getting widespread support and Intel’s chips have made it into very few mobile devices so far, Intel has some work ahead if it is to be a success.

A completely wire-free desk at home sounds great to me, and if this IDT/Intel venture is successful it could be a reality within a year or two.