Thursday, June 4, 2009

Seminar: Heterogeneous GPU Computing


The seminar covered the practical application of the GPU, a microprocessor optimized for common graphics workloads, to parallel computing tasks. The GPU can be thought of as the video card's analog to the CPU, and it sits on any modern video card that a consumer can go out and buy at the store.

A large part of the talk explained the differences between a CPU and a GPU. In both cases, the architecture of the silicon is designed to process what are known as "atomic" statements. As the name implies, atomic statements are instructions that the processor can execute natively, without first breaking them down into more basic instructions. Atomic statements are therefore "basic" instructions.

A general-purpose CPU like the Intel Core i7 is designed so that it can, with a little extra work, compute just about any problem that a piece of software or hardware requires. A GPU like the NVIDIA GTX 295, on the other hand, is designed to work quickly on mathematical problems such as matrix arithmetic. This is because the graphics displayed on a monitor, whether the Windows desktop or a rendered scene from a video game, are the result of numerous mathematical calculations on geometric objects. So instead of the problem being broken down into basic parts on the Core i7 (which adds considerable computational overhead), it can be processed immediately, as is, on the GTX 295.
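To make the "calculations on geometric objects" concrete, here is a minimal sketch in plain Python of one such calculation: rotating the corners of a shape with a 2x2 matrix. The function names (`rotation_matrix`, `transform`, `transform_all`) are made up for illustration and do not come from the seminar or from any graphics API. The point is that each vertex is transformed independently of the others, which is the kind of work a GPU can spread across its cores.

```python
import math

def rotation_matrix(angle_rad):
    """Return the 2x2 matrix that rotates a point counterclockwise by angle_rad."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s],
            [s,  c]]

def transform(matrix, vertex):
    """Multiply a 2x2 matrix by a single (x, y) vertex."""
    x, y = vertex
    return (matrix[0][0] * x + matrix[0][1] * y,
            matrix[1][0] * x + matrix[1][1] * y)

def transform_all(matrix, vertices):
    """Apply the same matrix to every vertex.

    No vertex depends on any other, so on a GPU each one could be
    handled by a different core at the same time. Here it is a plain loop.
    """
    return [transform(matrix, v) for v in vertices]

# A diamond-shaped square, rotated a quarter turn.
square = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
rotated = transform_all(rotation_matrix(math.pi / 2), square)
```

A real scene involves 3D vertices, 4x4 matrices, and far more points, but the shape of the work is the same.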

For parallel computing tasks, such a math-oriented environment is ideal. Many parallel computing problems require the hardware to spend enormous amounts of time crunching floating-point numbers. Adding to the computational power is the number of cores on a modern GPU; the current NVIDIA Tesla GPU contains 240 cores dedicated to processing. Naturally, the Tesla GPU outperforms the Core i7 almost threefold in computationally intensive tasks.

Of course, the disadvantage of the GPU is its utterly abysmal performance on general-purpose computation. The CPU is designed with the general case in mind, after all. One point that wasn't touched upon in the seminar is a common misconception about a GPU's performance compared to a CPU's. It seems that the amazing threefold advantage can only be achieved on problems termed "embarrassingly" parallel, that is, problems that are (laughably) easy to break down into independent parallel components.

I guess number-crunching fits that bill quite well.
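For the curious, here is a rough sketch of what "embarrassingly parallel" looks like in code, using a sum of squares as a stand-in for number-crunching. The helper names (`split`, `sum_of_squares`, `parallel_sum_of_squares`) and the choice of four workers are my own assumptions for illustration. It uses Python's standard `concurrent.futures` thread pool only to show the structure: in standard CPython, threads will not actually speed up CPU-bound work like this, and a GPU would run the chunks on real hardware cores instead.

```python
from concurrent.futures import ThreadPoolExecutor

def split(values, parts):
    """Cut a list into at most `parts` chunks of roughly equal size."""
    size = max(1, (len(values) + parts - 1) // parts)
    return [values[i:i + size] for i in range(0, len(values), size)]

def sum_of_squares(chunk):
    """The work done on one chunk; it needs nothing from any other chunk."""
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(values, workers=4):
    """Hand each chunk to a worker, then combine the partial results.

    The only coordination is the final sum, which is what makes the
    problem embarrassingly parallel.
    """
    chunks = split(values, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)

numbers = list(range(1000))
total = parallel_sum_of_squares(numbers)
```

Problems where one chunk has to wait on another chunk's result don't split this cleanly, and that is where the GPU's advantage shrinks.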

Monday, June 1, 2009

Longer Lasting Digital Memory


In an era where a sizable portion of the world's literature is stored in some digital form or medium, the shelf life of such media leaves a lot to be desired. The irony is that while efforts to "archive" the printed works of centuries long past were intended to preserve them, the printed medium will more than likely outlast the digital collection it has been copied into. This is due to theoretical limitations in using semiconductors for long-term storage.

Researchers have discovered an alternative to the traditional semiconductor approach by using some modern techniques from the field of nanotechnology. Digital information is usually stored in a medium as a machine-readable set of 1's and 0's. In the new approach, an iron nanoparticle sits inside a carbon nanotube in one of two positional states, and applying electricity moves the iron from one state to the other. The two positions can effectively represent the machine-readable 1's and 0's required by the system.
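As a toy illustration of the idea (not a model of the actual device), the sketch below treats each nanotube as a cell whose nanoparticle rests at one of two ends, and stores a byte across eight such cells. Everything here is hypothetical: the `NanotubeCell` class, the `LEFT`/`RIGHT` labels, and the choice that `RIGHT` means 1 are assumptions made purely for the example.

```python
LEFT, RIGHT = 0, 1  # assumed labels for the two positional states

class NanotubeCell:
    """Toy stand-in for one nanotube holding one iron nanoparticle."""

    def __init__(self):
        self.position = LEFT

    def pulse(self, direction):
        """Pretend an electrical pulse pushes the particle to one end."""
        self.position = direction

    def read(self):
        """Report where the particle currently sits (0 or 1)."""
        return self.position

def store_byte(value):
    """Write an 8-bit value into eight cells, most significant bit first."""
    cells = [NanotubeCell() for _ in range(8)]
    for i, cell in enumerate(cells):
        bit = (value >> (7 - i)) & 1
        cell.pulse(RIGHT if bit else LEFT)
    return cells

def load_byte(cells):
    """Read the cells back into an integer."""
    value = 0
    for cell in cells:
        value = (value << 1) | cell.read()
    return value

cells = store_byte(0b10110010)
recovered = load_byte(cells)
```

The particle stays where it was last pushed, so reading the cells back recovers the stored byte.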

Greater storage space can be achieved by packing components of the digital medium into dense clusters. This relationship is directly proportional: the greater the density of the medium in a given physical space, the greater the storage offered. Unfortunately, for semiconductors, density also has an inverse relationship to shelf life: the greater the density of the medium, the shorter its shelf life. The nanotube system is believed to be relatively stable in this regard, as nanotubes can be packed as densely as needed while yielding the same shelf life: over a billion years.

Nanoscale Reversible Mass Transport for Archival Memory [Nano Letters, ACS Publishing]