
FPGA

Field Programmable Gate Array.

Impulse C API in Brief

DMA:

Prototype: co_memory co_memory_create(const char *name, const char *loc, size_t size);
// "name" is used by external applications to identify this memory.
// Create a 64-byte memory block from the heap. loc is architecture dependent.
co_memory_create("name", "heap0", 64);

Prototype: void co_memory_readblock(co_memory mem, unsigned int offset, void *buf, size_t buffersize);
Prototype: void co_memory_writeblock(co_memory mem, unsigned int offset, void *buf, size_t buffersize);
Prototype: void *co_memory_ptr(co_memory mem); // Return a C pointer to the buffer of the shared memory.
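
To make the call pattern concrete, here is a minimal host-only sketch that mirrors the four prototypes above. Everything prefixed `mock_` is a hypothetical stand-in written for this post, not the Impulse C implementation. It assumes offsets are counted in bytes, ignores `loc`, and returns an error code on out-of-range access, whereas the real block functions return void.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical host-only stand-in for co_memory (NOT the Impulse C type). */
typedef struct {
    const char *name;    /* identifier an external application would use */
    unsigned char *buf;  /* backing storage */
    size_t size;         /* capacity in bytes */
} mock_memory;

/* Mirrors co_memory_create(name, loc, size); loc is ignored in this mock. */
mock_memory *mock_memory_create(const char *name, const char *loc, size_t size) {
    (void)loc;
    mock_memory *m = malloc(sizeof *m);
    if (!m) return NULL;
    m->buf = calloc(size, 1);
    if (!m->buf) { free(m); return NULL; }
    m->name = name;
    m->size = size;
    return m;
}

/* Mirrors co_memory_writeblock: copy n bytes from buf into the memory.
   Assumption: offset is in bytes. Returns 0 on success, -1 if out of range. */
int mock_memory_writeblock(mock_memory *m, unsigned int offset, const void *buf, size_t n) {
    if (offset > m->size || n > m->size - offset) return -1;
    memcpy(m->buf + offset, buf, n);
    return 0;
}

/* Mirrors co_memory_readblock: copy n bytes from the memory into buf. */
int mock_memory_readblock(mock_memory *m, unsigned int offset, void *buf, size_t n) {
    if (offset > m->size || n > m->size - offset) return -1;
    memcpy(buf, m->buf + offset, n);
    return 0;
}

/* Mirrors co_memory_ptr: return a C pointer to the underlying buffer. */
void *mock_memory_ptr(mock_memory *m) { return m->buf; }

/* Release the mock memory (helper added for this sketch only). */
void mock_memory_free(mock_memory *m) { free(m->buf); free(m); }
```

A typical sequence would be: create the memory, write a block at some offset from one side, then read the same block back from the other side.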

Fixed-Point Arithmetic:
Example of a format specification: 1s8.23 => 1 sign bit, 8 integer bits, 23 fraction bits.
Several overflow and rounding modes can be chosen: saturation, floor, ceil.

co_int16 a = (co_int16) FXCONST16(96,7); // a <- 96 in 1s8.7 format.
c = FXADD16(a,b,8); // Add a and b in 1s7.8 format.
c = FXMUL32(a,b,8); // Multiply a and b in 1s23.8 format (32-bit operands, 8 fraction bits).
c = FXDIV16(a,b,10); // c <- a/b in 1s5.10 format.

Macros: FXADD8(), FXADD16(), FXADD32(), FXMUL8(), FXMUL16(), FXMUL32(), ...
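
To show what such macros have to do underneath, here is a minimal sketch of 16-bit fixed-point arithmetic in plain C. The `fx_*` helpers are hypothetical names made up for this post; they are not the Impulse C macros and may differ from them in rounding and overflow behavior. Multiplication here uses floor mode via a right shift, which assumes the compiler shifts negative values arithmetically, as mainstream compilers do.

```c
#include <stdint.h>

/* Convert an integer to fixed point with `frac` fraction bits. */
int16_t fx_from_int(int value, int frac) {
    return (int16_t)(value * (1 << frac));
}

/* Saturating add: operands share one format, so the raw integers add
   directly; the result is clamped instead of wrapping on overflow. */
int16_t fx_add16_sat(int16_t a, int16_t b) {
    int32_t s = (int32_t)a + (int32_t)b;
    if (s > INT16_MAX) return INT16_MAX;
    if (s < INT16_MIN) return INT16_MIN;
    return (int16_t)s;
}

/* Multiply: the raw product carries 2*frac fraction bits, so shift right
   by frac to return to the operand format (floor mode, no saturation). */
int16_t fx_mul16(int16_t a, int16_t b, int frac) {
    int32_t wide = (int32_t)a * (int32_t)b;
    return (int16_t)(wide >> frac);
}

/* Divide: pre-scale the dividend by 2^frac so the quotient keeps frac
   fraction bits. The caller must ensure b != 0. */
int16_t fx_div16(int16_t a, int16_t b, int frac) {
    int32_t wide = (int32_t)a * (1 << frac);
    return (int16_t)(wide / b);
}

/* Convert back to floating point, for inspection only. */
double fx_to_double(int16_t x, int frac) {
    return (double)x / (double)(1 << frac);
}
```

In 1s7.8 format, 1.5 is stored as 384 and 2.25 as 576. Their product is 864, which is 3.375 once divided by 256.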

FPGA News

FPGA (Field Programmable Gate Array) is an emerging technology in embedded systems and a computation accelerator for high-performance computing.

29 Apr 2007: Altera Announces FPGA-Based Accelerator Support for Intel Front Side Bus
16 Jan 2007: HP claims FPGA breakthrough
15 Jan 2007: Mitrion FPGA Supercomputing Platform Now Available With Xilinx Synthesis and Place & Route Software
15 Jan 2007: Xilinx offers faster FPGA design
14 Nov 2006: Computational Bottlenecks and Hardware Decisions for FPGAs
13 Nov 2006: XtremeData Announces Upcoming FPGA-Supercomputing Support for Mitrion Virtual Processor and Mitrion Development Platform
23 Oct 2006: Celoxica Ships HTX FPGA Acceleration Solution
23 Oct 2006: Lattice Announces Availability of Serial RapidIO Core From Mercury Computer Systems
23 Oct 2006: New Lattice Design Tools Deliver Comprehensive New Product Support, Innovative HDL Management Tool
21 Oct 2006: New Image Processing Development Platform Empowers System Designers
20 Oct 2006: QuickLogic - Ultra low-power FPGA family extends to 300,000 gates
20 Oct 2006: Xilinx Demonstrates Low Power Solutions for Radio Modem and Cryptography At MILCOM 2006
19 Oct 2006: Synplicity, Achronix collaborating on FPGA synthesis
18 Oct 2006: Xilinx Extends Platform FPGA with Microblaze Soft Processor
12 Oct 2006: FPGAs: Architectural Innovations Open Up New Applications
12 Oct 2006: Xilinx extends platform FPGA performance with MicroBlaze soft processor
10 Oct 2006: QuickLogic expands PolarPro family with 300K-gate FPGA
09 Oct 2006: ClearSpeed accelerates largest supercomputer in Japan
09 Oct 2006: ClearSpeed Accelerates Tokyo Tech Supercomputer
09 Oct 2006: Xilinx Extends Platform FPGA Performance with Award Winning MicroBlaze Soft Processor
21 Aug 2006: IP core speeds design of Interlaken-based networks
xx Jul 2006: FPGA design security
xx Jul 2006: Right architecture for 65nm FPGAs
27 Jun 2006: IBM, ClearSpeed team up on supercomputing

FPGA Optimization

C-level Optimization such as in Impulse C

  1. Limit hardware resource usage by introducing loops, so that one block of logic is reused across iterations instead of being replicated.
  2. Split arrays to allow multiple storage accesses. Each array can get its own storage that streams directly into a local computation unit, giving parallel local memory accesses. Note that a register bank can be read by many sources in the same cycle, so small, hot data should go into registers.
  3. Improve communication performance by fully utilizing the CPU-FPGA bus width: transfer as many bits per transaction as the bus carries. DMA is another feasible communication mechanism, but it will hog the bus if used too frequently.
  4. Unroll loops to achieve higher parallelism at the cost of more FPGA gates.
  5. Pipeline the main loop to close the communication gaps between iterations caused by data loading and flushing.
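
Items 2 to 4 can be sketched in plain C. This is only an illustration of the code shape, under the assumptions of a 32-bit bus and 8-bit samples; the function names are made up for this post, and on a CPU the code runs sequentially. The parallelism only appears once a C-to-FPGA compiler maps each bank and accumulator to its own hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Item 3: pack four 8-bit samples into one 32-bit word so that each
   transfer fills the full width of an (assumed) 32-bit bus. */
uint32_t pack4(const uint8_t s[4]) {
    return (uint32_t)s[0]
         | ((uint32_t)s[1] << 8)
         | ((uint32_t)s[2] << 16)
         | ((uint32_t)s[3] << 24);
}

/* Receiving side: split the word back into its four samples. */
void unpack4(uint32_t w, uint8_t s[4]) {
    for (int i = 0; i < 4; i++)
        s[i] = (uint8_t)(w >> (8 * i));
}

/* Items 2 and 4: the data is split into four separate arrays (banks) and
   the loop body is unrolled four ways. The four additions in one
   iteration touch different banks and different accumulators, so they
   do not depend on each other. */
uint32_t sum_unrolled(const uint8_t *bank0, const uint8_t *bank1,
                      const uint8_t *bank2, const uint8_t *bank3, size_t n) {
    uint32_t acc0 = 0, acc1 = 0, acc2 = 0, acc3 = 0;
    for (size_t i = 0; i < n; i++) {
        acc0 += bank0[i];
        acc1 += bank1[i];
        acc2 += bank2[i];
        acc3 += bank3[i];
    }
    return acc0 + acc1 + acc2 + acc3;
}
```

The trade-off from item 1 still applies: the unrolled version uses four adders where the rolled loop would reuse one.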

Reference: Optimizing Impulse C Code for Performance by Scott Thibault & David Pellerin.

Logic-level Optimization in Boolean Network model

  • Restructuring operations.
    • Reduce dependent inputs.
    • Factorize.
    • Substitute.
    • Eliminate.
  • Node minimization.
    • Minimize using don't-care inputs.
    • ...
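
As a small illustration of the "Factorize" step, the sketch below compares a two-level expression with its factored form and checks by exhaustive enumeration that both compute the same function. The function names and the example expression are chosen for this post only.

```c
#include <stdbool.h>

/* Original two-level form: three AND gates feeding two OR gates. */
bool f_original(bool a, bool b, bool c, bool d) {
    return (a && b) || (a && c) || (a && d);
}

/* Factored form: a is pulled out, leaving one AND gate and two OR gates. */
bool f_factored(bool a, bool b, bool c, bool d) {
    return a && (b || c || d);
}

/* Compare both forms over all 16 input combinations and return the
   number of combinations on which they disagree. */
int count_mismatches(void) {
    int mismatches = 0;
    for (unsigned v = 0; v < 16; v++) {
        bool a = (v & 1u) != 0;
        bool b = (v & 2u) != 0;
        bool c = (v & 4u) != 0;
        bool d = (v & 8u) != 0;
        if (f_original(a, b, c, d) != f_factored(a, b, c, d))
            mismatches++;
    }
    return mismatches;
}
```

Exhaustive checking like this only scales to a handful of inputs, but it is a handy sanity test when restructuring small expressions by hand.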

Binary Decision Diagram

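BDD (Binary Decision Diagram) is a DAG (Directed Acyclic Graph) that represents a logic function. Each non-terminal node tests one input variable, and each of its two outgoing edges corresponds to one value (0 or 1) of that variable; the terminal nodes give the function's output. Looks like a state machine, huh?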
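
Here is a minimal sketch of a BDD in C, following the common formulation in which each non-terminal node tests one input variable and has a low (0) edge and a high (1) edge. The struct layout and names are assumptions for illustration. The example diagram encodes f(a, b, c) = (a AND b) OR c with variable order a, b, c.

```c
#include <stddef.h>

typedef struct bdd_node {
    int var;                     /* index of the tested input; -1 for a terminal */
    int value;                   /* terminal value (0 or 1); unused otherwise */
    const struct bdd_node *low;  /* edge followed when the input is 0 */
    const struct bdd_node *high; /* edge followed when the input is 1 */
} bdd_node;

/* Evaluate the function by walking from the given node to a terminal. */
int bdd_eval(const bdd_node *n, const int *inputs) {
    while (n->var >= 0)
        n = inputs[n->var] ? n->high : n->low;
    return n->value;
}

/* Terminals. */
static const bdd_node T0 = { -1, 0, NULL, NULL };
static const bdd_node T1 = { -1, 1, NULL, NULL };

/* f = (a AND b) OR c, with inputs[0] = a, inputs[1] = b, inputs[2] = c.
   NODE_C is shared by two parents, which is why the graph is a DAG
   rather than a tree. */
static const bdd_node NODE_C = { 2, 0, &T0, &T1 };
static const bdd_node NODE_B = { 1, 0, &NODE_C, &T1 };
static const bdd_node NODE_A = { 0, 0, &NODE_C, &NODE_B };
```

Calling `bdd_eval(&NODE_A, inputs)` with `inputs = {1, 1, 0}` follows a's high edge, then b's high edge, and lands on the 1 terminal.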

Terminology

CLB: Configurable Logic Block.
FPGA: Field Programmable Gate Array.
HDL: Hardware Description Language.
LUT: Look Up Table.

FPGA?

FPGA (Field Programmable Gate Array) is becoming the buzzword in High Performance Computing. }:) Speedups of 30x are commonly reported for various applications. Progeniq reported a significant speedup on ClustalW, a bioinformatics application. :o

But can you imagine using logic gates to perform the tasks in a typical C program? Of course, a C-to-FPGA compiler (e.g. Handel-C, Impulse C, SystemC) solves that problem. But using an FPGA does not automatically guarantee a good speedup. The number of logic gates and the longest path through them must both be minimized. These minimizations are difficult for the compiler, so some effort is required to help it achieve good performance.
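
One way to help with the longest path is to restructure expressions by hand. The sketch below, with names made up for this post, sums eight values two ways. Both use seven additions, but written as a chain each addition waits for the previous one (seven adders in series), while written as a balanced tree the longest path is only three adders. Whether a particular C-to-FPGA compiler performs this rebalancing on its own is not something this post can confirm, so treat it as a manual hint.

```c
/* Linear chain: every addition depends on the one before it,
   so the longest path passes through 7 adders. */
int sum_chain(const int x[8]) {
    int s = x[0];
    for (int i = 1; i < 8; i++)
        s += x[i];
    return s;
}

/* Balanced tree: same 7 additions, but arranged in 3 levels,
   so the longest path passes through only 3 adders. */
int sum_tree(const int x[8]) {
    /* Level 1: four independent additions. */
    int l1a = x[0] + x[1];
    int l1b = x[2] + x[3];
    int l1c = x[4] + x[5];
    int l1d = x[6] + x[7];
    /* Level 2: two independent additions. */
    int l2a = l1a + l1b;
    int l2b = l1c + l1d;
    /* Level 3: final addition. */
    return l2a + l2b;
}
```

The gate count is the same in both versions; only the depth changes, which is what limits the clock rate.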

The keys to success for FPGAs:

  1. Quick conversions of existing programs for wide user coverage.
  2. High performance, significantly faster than general purpose processors.
  3. Portable performance to various FPGA chips.
  4. Cheap is good.