Thursday, June 4, 2009

Seminar: Heterogeneous GPU Computing


The seminar covered the practical application of the GPU, a microprocessor heavily optimized for common graphical workloads, to general parallel computing tasks. The GPU can be thought of as the video card's analog of the CPU, and one ships in just about any modern video card a consumer can go out and buy at the store.

A large part of the talk was spent explaining the differences between a CPU and a GPU. In both cases, the silicon is crafted to process what are known as "atomic" instructions. As the name implies, an atomic instruction is one the processor can execute directly in hardware, with no need to break it down into simpler operations. Atomic instructions are therefore the processor's "basic" instructions.

A general-purpose CPU like the Intel Core i7 is crafted so that it can, with a little extra work, compute just about any problem a piece of software or hardware requires. A GPU, on the other hand, like the NVIDIA GTX 295, is crafted to tear through specific mathematical workloads, matrix math in particular. This is because the graphics displayed on a monitor, whether the Windows desktop or a rendered scene from a video game, are the product of enormous numbers of mathematical operations on geometric objects. So instead of such a problem being broken down into more basic steps on the Core i7 (which adds considerable computational overhead), it can be processed essentially as-is on the GTX 295.
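
To make the matrix-math point concrete, here is a minimal CUDA sketch (my own illustration, not code from the seminar) of the kind of work a GPU does natively: applying a single 4x4 transform matrix to a large batch of vertices. The names transformVertices, m, in, and out are placeholders I made up.

    #include <cuda_runtime.h>

    // Each thread transforms exactly one vertex by the same 4x4 matrix.
    // Vertices are stored as (x, y, z, w) quadruples; the matrix is row-major.
    __global__ void transformVertices(const float* m,   // 16 floats: the 4x4 matrix
                                      const float* in,  // n * 4 input components
                                      float* out,       // n * 4 output components
                                      int n)            // number of vertices
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;

        const float* v = in + 4 * i;
        float* r = out + 4 * i;
        for (int row = 0; row < 4; ++row) {
            r[row] = m[4 * row + 0] * v[0]
                   + m[4 * row + 1] * v[1]
                   + m[4 * row + 2] * v[2]
                   + m[4 * row + 3] * v[3];
        }
    }

Every vertex gets its own thread doing a handful of multiply-adds, which is exactly the shape of problem the hardware was built around.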

For parallel computing tasks, such a math-oriented environment is ideal. Many parallel computing problems require the hardware to spend enormous amounts of time crunching floating-point numbers. Adding to the computational power is the sheer number of cores on a modern GPU; the current NVIDIA Tesla GPU contains 240 cores dedicated to processing. Naturally, the Tesla outperforms the Core i7 almost threefold on these computationally intensive tasks.
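
To give a feel for how all those cores get fed, here is a hedged host-side sketch that launches the transformVertices kernel above with one thread per vertex, assuming both live in the same .cu file. The block size of 256 is a conventional choice of mine, not a figure from the talk.

    // Host-side sketch: request one thread per vertex. We ask for far more
    // threads than there are cores; the hardware schedules them onto the
    // cores as they become free.
    void transformOnGpu(const float* hostM, const float* hostIn,
                        float* hostOut, int n)
    {
        float *dM, *dIn, *dOut;
        cudaMalloc((void**)&dM, 16 * sizeof(float));
        cudaMalloc((void**)&dIn, n * 4 * sizeof(float));
        cudaMalloc((void**)&dOut, n * 4 * sizeof(float));

        cudaMemcpy(dM, hostM, 16 * sizeof(float), cudaMemcpyHostToDevice);
        cudaMemcpy(dIn, hostIn, n * 4 * sizeof(float), cudaMemcpyHostToDevice);

        int threadsPerBlock = 256;  // conventional choice, not tuned
        int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
        transformVertices<<<blocks, threadsPerBlock>>>(dM, dIn, dOut, n);

        cudaMemcpy(hostOut, dOut, n * 4 * sizeof(float), cudaMemcpyDeviceToHost);
        cudaFree(dM);
        cudaFree(dIn);
        cudaFree(dOut);
    }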

Of course, the disadvantage of the GPU is its utterly abysmal performance on general computation; the CPU is designed with the general case in mind, after all. An additional note that wasn't touched upon in the seminar is a common misconception about a GPU's performance relative to a CPU's: that amazing threefold advantage can only be achieved on problems termed "embarrassingly" parallel, i.e. problems that are (laughably) easy to break down into independent parallel components.

I guess number-crunching fits that bill quite well.
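
For what it's worth, the textbook example of an embarrassingly parallel problem is element-wise vector addition: each output element depends only on its own two inputs, so the threads never need to coordinate with one another. A minimal sketch, again mine rather than the seminar's:

    // Embarrassingly parallel: every thread computes one element, with no
    // communication or shared state between threads whatsoever.
    __global__ void vectorAdd(const float* a, const float* b, float* c, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            c[i] = a[i] + b[i];
    }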

1 comment:

  1. I'm jealous! I wanted to go to this conference but I forgot! grr... Thanks for discussing what it was about.
