Is a GPU really just a kind of ASIC?

MentalNomad · January 9, 2020, 5:36pm

Not really.

An ASIC is an Application-Specific Integrated Circuit - a chip made to do just one thing, but do it efficiently.

GPUs, often sold as “Graphics Cards,” were first made to render images quickly, which sounds kinda ‘Application-Specific,’ but they do this by allowing for massive parallelization. Much of image processing (ray tracing, frame rendering, texture mapping) is very well suited to parallel processing. In computer science, work like this is called “embarrassingly parallel,” but these are still just general tasks done in parallel.

Lots of tasks are embarrassingly parallel: Embarrassingly parallel - Wikipedia
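To make "embarrassingly parallel" concrete, here is a minimal Python sketch. The function names, the brightening operation, and the pixel data are all made up for illustration. Each pixel is processed without looking at any other pixel, so the list can be cut into chunks, handed to separate workers, and stitched back together with the same result as a plain loop. (Threads in standard Python won't actually speed up this kind of arithmetic; the point is only the shape of the work.)

```python
from concurrent.futures import ThreadPoolExecutor

def brighten(pixel, amount=40):
    """Brighten one 0-255 grayscale pixel; it needs no other pixel's value."""
    return min(255, pixel + amount)

def brighten_chunk(chunk):
    """Process one chunk of pixels independently of every other chunk."""
    return [brighten(p) for p in chunk]

def split(items, parts):
    """Cut a list into `parts` contiguous chunks of nearly equal size."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks

def brighten_parallel(pixels, workers=4):
    """Hand each chunk to its own worker, then stitch the results back in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(brighten_chunk, split(pixels, workers)))
    return [p for chunk in results for p in chunk]

image = [(i * 37) % 256 for i in range(1000)]  # made-up pixel data
sequential = [brighten(p) for p in image]
parallel = brighten_parallel(image)
print(sequential == parallel)  # True
```

No worker ever has to wait for, or talk to, another worker. That lack of coordination is what makes the problem "embarrassingly" easy to parallelize.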

GPUs are used for flow modeling, 3-D modeling, neural networks, radio signal analysis (SETI), protein modeling (Folding), cancer and AIDS research (GPUGRID), vector math, engineering, scientific computing, self-driving cars, deep learning… and, oh yeah, cryptocurrency mining.

That’s not Application-Specific, it’s General!

GPUs are so much more flexible because, unlike an ASIC that does just one thing, a GPU can be programmed through a framework (like OpenCL, CUDA, or OpenMP’s device offloading) which lets the software do many, many different things on the chip. Any non-trivial task that can be done in parallel (as opposed to being strictly sequential) can probably be assigned to the GPU and done faster than on a CPU.
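As a rough picture of what "assigning a task to the GPU" looks like, GPU frameworks generally have you write a small function (a "kernel") that handles one element, and the hardware runs it for every index at once. The sketch below only imitates that model in plain Python. `launch` is a hypothetical stand-in, not a real OpenCL or CUDA call, and it visits the indices in shuffled order to mimic the fact that no particular order is promised.

```python
import random

def saxpy_kernel(i, a, x, y, out):
    """Work for ONE index: out[i] = a * x[i] + y[i]. Reads and writes only slot i."""
    out[i] = a * x[i] + y[i]

def launch(kernel, n, *args):
    """Hypothetical stand-in for a GPU launch: run the kernel once per index.

    A real GPU runs many indices at the same time; here they run one by one,
    in shuffled order, to show that the order does not matter.
    """
    indices = list(range(n))
    random.shuffle(indices)
    for i in indices:
        kernel(i, *args)

x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)
launch(saxpy_kernel, len(x), 2.0, x, y, out)
print(out)  # [12.0, 24.0, 36.0, 48.0]
```

The answer comes out the same no matter which index runs first, which is why thousands of GPU threads can each take one index. A loop written as `total[i] = total[i-1] + x[i]` would not fit this shape as written, because slot `i` has to wait for slot `i-1`.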

I think of a GPU as a Parallel Processing Unit which was originally designed to process pixels in images, but is now usable for all sorts of general purposes.

Even an AI-specific accelerator, like Google’s Tensor Processing Unit, is programmable for a variety of tasks!

MentalNomad · January 9, 2020, 5:48pm

Here’s an example of doing scientific calculations faster using GPUs, from Google’s AI Blog.

For this physics problem, they wrote an

algorithm for approximating the ground state of either a periodic quantum spin chain (1D) or a lattice model on a thin torus (2D)

This is a task that can be set up in parallel, so a GPU can out-perform a CPU, even a CPU with many cores:

For CPU computations we used Xeon Skylake with 1, 8, 16, and 32 cores. For GPU computations we used NVIDIA Tesla V100. For further reference, we also run equivalent numpy code using a single CPU.

Notice that the chart is on a log scale. The GPU out-performs the CPUs so massively, by up to a factor of 100, that a log scale is needed to show both on the same chart!
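A quick bit of arithmetic shows why the log scale matters. The run times, chart height, and axis ranges below are made up for illustration; only the 100x ratio comes from the post. The sketch compares how tall two bars would be drawn on a linear axis versus a logarithmic one.

```python
import math

CHART_PX = 400  # hypothetical chart height in pixels

def linear_height(value, axis_max):
    """Bar height when the axis runs linearly from 0 to axis_max."""
    return CHART_PX * value / axis_max

def log_height(value, axis_min, axis_max):
    """Bar height when the axis runs logarithmically from axis_min to axis_max."""
    span = math.log10(axis_max / axis_min)
    return CHART_PX * math.log10(value / axis_min) / span

# Made-up run times; only their 100x ratio comes from the post.
gpu_seconds = 1.0
cpu_seconds = 100.0

print("linear:", linear_height(gpu_seconds, 100.0), linear_height(cpu_seconds, 100.0))
print("log:   ", log_height(gpu_seconds, 0.1, 1000.0), log_height(cpu_seconds, 0.1, 1000.0))
```

On the linear axis the smaller bar comes out at about 4 pixels next to a 400-pixel bar, so it is barely visible. On the log axis (0.1 to 1000 here) the same two values land at roughly 100 and 300 pixels, and both are easy to read.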

(If interested, look into the full paper, TensorNetwork on TensorFlow: A Spin Chain Application Using Tree Tensor Networks.)