09-13-2008, 12:22 PM   #4
Northy

Applications for GPGPU
Snagged from the wiki
  • Computer clusters or a variation of parallel computing (utilizing GPU cluster technology) for highly calculation-intensive tasks:
    - High-performance clusters (HPC) (supercomputing) including distributed computing.
    - Grid computing (a form of distributed computing) (networking many heterogeneous computers to create a virtual computer architecture)
    - Load-balancing clusters (a server farm)
  • Physically based simulation and physics engines (e.g. Newtonian-style physics models), inc. cloth, hair, fluid flow (liquids, smoke)
  • Segmentation – 2D and 3D
  • CT reconstruction
  • Fast Fourier transform
  • Tone mapping
  • Audio signal processing inc. for digital, analog & speech processing
  • Digital image processing
  • Video Processing
    - Hardware accelerated video decoding and post-processing (Vista has it. Come on Snow Leopard!!)
    - Hardware accelerated video encoding and pre-processing
  • Raytracing
  • Scientific computing - weather, climate forecasting, molecular modelling inc. X-ray crystallography
  • Bioinformatics[4][5]
  • Computational finance
  • Medical imaging
  • Computer vision
  • Neural networks
  • Cryptography and cryptanalysis

I'd imagine SIGGRAPH 08 & 09 will be buzzing with this stuff.




CUDA - Compute Unified Device Architecture.

Good long read here

An SDK and API: a C compiler and set of development tools that let programmers use C to code "algorithms for execution" on the GPU (graphics processing unit). It is developed by NVIDIA and requires an NVIDIA GPU (G8X upwards, including the GeForce, Quadro & Tesla lines). It gives developers access to the native instruction set and memory of the massively parallel computational elements in CUDA GPUs. The CUDA SDK was first made public in Feb 2007. Put simplistically, as the wiki says, CUDA turns NVIDIA GPUs into powerful, programmable open architectures like today's CPUs (central processing units).

What might be helped by this? For the gaming industry, physics calculations, including debris, smoke, fire and fluids. For the acceleration CUDA gives non-graphical computation in computational biology and other fields, the wiki links to two BioMed Central papers: "High-throughput sequence alignment using Graphics Processing Units" and "CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment".

Advantages over general-purpose computation on GPUs (GPGPU) using graphics APIs:
  • Uses the standard C language, with some simple extensions
  • Code can write to arbitrary addresses in memory.
  • CUDA exposes a fast shared memory region (16KB in size) that can be shared amongst threads. This can be used as a user-managed cache, enabling higher bandwidth than is possible using texture lookups.
  • Faster downloads and readbacks to and from the GPU
  • Full support for integer and bitwise operations
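
To make the shared-memory point above a bit more concrete, here is a rough CPU-side sketch in Python (not CUDA code, and not from the wiki). It counts reads from "global" memory for a small neighbourhood sum, first naively and then with each block staging a tile into a local list that stands in for shared memory. The function names, block size and radius are all made up for illustration.

```python
def stencil_naive(data, radius):
    """Every 'thread' reads its whole neighbourhood straight from global memory."""
    n = len(data)
    out = [0] * n
    global_reads = 0
    for i in range(n):  # one loop iteration stands in for one GPU thread
        total = 0
        for j in range(i - radius, i + radius + 1):
            if 0 <= j < n:
                total += data[j]
                global_reads += 1
        out[i] = total
    return out, global_reads


def stencil_tiled(data, radius, block_size):
    """Each 'block' copies its tile (plus halo) once, then works from the copy."""
    n = len(data)
    out = [0] * n
    global_reads = 0
    for start in range(0, n, block_size):  # one iteration per thread block
        end = min(start + block_size, n)
        lo = max(start - radius, 0)
        hi = min(end + radius, n)
        tile = data[lo:hi]        # stands in for the shared memory region
        global_reads += hi - lo   # each element is fetched once per block
        for i in range(start, end):
            total = 0
            for j in range(i - radius, i + radius + 1):
                if 0 <= j < n:
                    total += tile[j - lo]  # neighbour reads hit the tile
            out[i] = total
    return out, global_reads


if __name__ == "__main__":
    data = list(range(64))
    print(stencil_naive(data, 2)[1], stencil_tiled(data, 2, 16)[1])
```

Both versions produce the same output; only the number of trips to "global" memory changes. On real hardware the saving depends on access patterns and on the tile fitting in the 16KB region, which this toy model does not capture.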

Cons:
  • CUDA-enabled GPUs are only available from Nvidia
  • Texture rendering & recursive functions are not supported
  • Some deviations from the IEEE 754 floating-point standard.
  • Bus bandwidth and latency between the CPU and the GPU can be a bottleneck.
  • Threads run in groups (warps) of 32 that execute identical instructions simultaneously. Branches in the program code do not impact performance significantly, provided that all 32 threads take the same execution path; otherwise the SIMD execution model becomes a significant limitation for any inherently divergent task (e.g., traversing a ray tracing acceleration data structure).
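
The branching point in the last item can be sketched on the CPU. The Python below is a simplified model I put together, not CUDA and not how the hardware is actually implemented: it assumes a group of 32 threads has to step through each side of an if/else that at least one of its threads takes, with the other threads masked off. The kernel body and function name are hypothetical.

```python
WARP_SIZE = 32  # group width quoted in the list above


def run_branchy_kernel(inputs):
    """Apply `y = x * 2 if x is even else x + 1` one warp at a time.

    Returns (outputs, passes), where `passes` counts how many branch
    bodies were stepped through across all warps in this toy model.
    """
    outputs = [None] * len(inputs)
    passes = 0
    for start in range(0, len(inputs), WARP_SIZE):
        lanes = range(start, min(start + WARP_SIZE, len(inputs)))
        even_mask = [inputs[i] % 2 == 0 for i in lanes]
        # "then" side: stepped through once if any lane needs it
        if any(even_mask):
            passes += 1
            for i, active in zip(lanes, even_mask):
                if active:
                    outputs[i] = inputs[i] * 2
        # "else" side: stepped through once if any lane needs it
        if not all(even_mask):
            passes += 1
            for i, active in zip(lanes, even_mask):
                if not active:
                    outputs[i] = inputs[i] + 1
    return outputs, passes


if __name__ == "__main__":
    coherent = [0] * 32 + [1] * 32   # each warp agrees on the branch
    divergent = list(range(64))      # even/odd lanes mixed in every warp
    print(run_branchy_kernel(coherent)[1], run_branchy_kernel(divergent)[1])
```

With the same 64 inputs' worth of work, the coherent layout needs one pass per warp and the mixed layout needs two, which is the cost the list item is describing.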

You can see examples of what CUDA can do here (It's flash based).

Why? From Beyond 3D's article:
- Neither DirectX nor OpenGL was designed with GPGPU as its primary goal, which limits their performance
- Arbitrary reads and writes to memory while bypassing the caching system (or flushing it) are still not supported in the Direct3D 10 API

AMD: Streaming Close to the Metal

CTM (Close to Metal) was AMD's earlier GPGPU interface; its commercial successor is the AMD Stream SDK, released in 2007.
Like CTM, Stream SDK provides tools for general-purpose access to AMD graphics hardware.

Differences:
"The idea behind CTM is that there is efficiency to be gained by giving an experienced programmer more direct control to the underlying hardware.
CTM is thus 'fundamentally [an] assembly language'. CUDA on the other hand aims to simplify GPGPU programming by exposing the system via a standard implementation of the C language. At this point in time, the underlying assembly language output (also known as 'NVAsc') is not exposed to the application developer."

"CUDA exposes the NVIDIA G80 architecture through a language extremely close to ANSI C, and extensions to that language to expose some of the GPU-specific functionality. This is in opposition to AMD's CTM, which is an assembly language construct that aims to be exposed through third party backends. The two are thus not directly comparable at this time."

Target areas: chipsets, graphics, handhelds, desktops, visualisation, near-time and real-time rendering. Example markets: rigid body physics, matrix numerics, wave equation solving, biological sequence matching, finance.

GPGPU: General-purpose computing on GPUs (graphics processing units)
From the wiki: Made possible "by adding programmable stages and higher precision arithmetic to the rendering pipelines, which allows software developers to use stream processing on non-graphics data."

Basically, this expands the purpose of a GPU from just accelerating parts of the graphics pipeline to also accelerating the computer's non-graphics computations. There are certain restrictions in operation and programming: GPUs are only effective on problems that suit stream processing, i.e. operating "in parallel by running a single kernel on many records in a stream at once."

A stream being "a set of records that require similar computation. Streams provide data parallelism."
Kernels are the functions applied to each element in the stream. In GPUs, for example, vertices & fragments are the stream elements, and vertex & fragment shaders are the kernels run on them.
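
As a rough illustration of the stream/kernel idea (plain Python on the CPU, not shader or CUDA code), a kernel is just a function that looks at one record and nothing else, and the "GPU" applies it to every record. The names `run_kernel` and `scale_vertex` are made up for this sketch.

```python
def run_kernel(kernel, stream):
    """Apply one kernel independently to every record in the stream.

    No record depends on any other, so a GPU would be free to process
    them in any order, or all at once.
    """
    return [kernel(record) for record in stream]


def scale_vertex(vertex):
    """Hypothetical vertex-shader-like kernel: doubles each coordinate."""
    x, y, z = vertex
    return (2.0 * x, 2.0 * y, 2.0 * z)


if __name__ == "__main__":
    vertices = [(1.0, 2.0, 3.0), (0.0, -1.0, 0.5)]
    print(run_kernel(scale_vertex, vertices))
```

The list comprehension runs sequentially here; the point is only that nothing in the kernel forces that order, which is where the data parallelism comes from.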

"The most common form for a stream to take in GPGPU is a 2D grid because this fits naturally with the rendering model built into GPUs. Many computations naturally map into grids: matrix algebra, image processing, physically based simulation, and so on."


Apple's position? OpenCL, in a post below.
__________________
George W. Bush ~ It's clearly a budget. It's got a lot of numbers in it.