
Learning AI Poorly: Why does AI run on graphics cards?


(Originally posted on LinkedIn)

To get ready for today’s article I spent this week attending random webinars about AI so I could write about the experience. Whenever a webinar was suggested by whatever algorithm, I signed up and watched. I thought I could make it funny but… you know… webinars… They’re terrible! Now what?

I’ve had dozens of people write me* asking, “Why does AI need graphics cards?” so let’s talk about that.

GPU? CPU? Wha? #

Your computer, phone, iPad, whatever runs on a chip called a Central Processing Unit (CPU). It is a piece of silicon with electronic circuits etched into it that execute the instructions of a computer program. A CPU has a few different parts responsible for things like doing math, getting data out of memory, and controlling what goes where. The key here is that it runs instructions linearly. That is, it runs one instruction after another. Now, there are tricks and techniques to run a few instructions in parallel and split work across a few cores running at the same time, but for the most part, CPUs are a linear thing.
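Here’s a toy sketch of that “one instruction after another” idea in plain Python. Everything in it (the numbers, the scaling) is made up purely to illustrate:

```python
# A CPU-style computation: one step after another.
# The data and the "multiply by 2" operation are made up for illustration.

data = [1.5, 2.0, 3.25, 4.0]    # some numbers we want to scale

results = []
for x in data:                  # each element is handled in turn
    results.append(x * 2.0)     # one multiply at a time, in order

print(results)                  # [3.0, 4.0, 6.5, 8.0]
```

A GPU’s whole pitch is that it would do all of those multiplies at the same time instead of one after another.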

That’s great, and it is really good and useful, but in the 1970s the CPUs that were available weren’t cutting it graphics-wise for arcade games. Engineers at Midway and Taito started building specialized chips that managed the data that needed to be scanned out to the monitor. They worked, and through the ’80s new specialized graphics chips were able to do more things, like draw lines and fill in areas with color. The ’90s saw them evolve into real-time 3D graphics accelerators. Guess what one of the core mechanics of any 3D graphics engine is? Matrices of numbers and tons of real-time matrix transformations. That means you have to do a ton of calculations simultaneously to move objects around and display them on a screen. You need parallel processing. In 1994, Sony coined the term “GPU” in reference to the graphics chip inside the first PlayStation. The market asked and engineers delivered.
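To make that concrete, here’s a rough Python/NumPy sketch (NumPy assumed installed) of the kind of matrix math a 3D scene needs constantly. The rotation angle and the pile of random vertices are made up for illustration:

```python
import numpy as np

# One rotation matrix applied to a big batch of 3D points at once --
# the bread-and-butter workload of real-time 3D graphics.

angle = np.radians(30)
rotate_z = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle),  np.cos(angle), 0.0],
    [0.0,            0.0,           1.0],
])

vertices = np.random.rand(100_000, 3)    # 100,000 made-up points in space

# One matrix multiply transforms every vertex "at the same time" --
# exactly the kind of work a GPU is built to parallelize.
transformed = vertices @ rotate_z.T
print(transformed.shape)                 # (100000, 3)
```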

In the early 2000s, Nvidia created the first chip capable of programmable “shaders.” A shader is a program that calculates the value each pixel on your screen has to be to represent a 3D scene. The process is called “shading.” Anyway, what this means is Nvidia made it so you could write a short program that could take some inputs, like textures, combine them with 3D shapes, and project that onto the screen, all within the graphics card, quickly, and massively in parallel. I was around back then, and it was mind-blowing. In fact, it blew my mind to the point that I never could get a shader to work in my 3D graphics class. Oh well.
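If you’ve never seen one, here’s the shader idea boiled down to a toy Python sketch. Real shaders are written in special shading languages and run on the GPU for every pixel at once; this just shows the shape of the thing, and the gradient effect and image size are made up:

```python
# A toy "shader": a tiny function that computes the brightness of a single
# pixel from its coordinates. On a GPU this would run for every pixel at
# the same time; here we fake it with plain loops.

WIDTH, HEIGHT = 320, 240

def shade(x, y):
    # made-up effect: a simple diagonal gradient, values from 0.0 to 1.0
    return (x / WIDTH + y / HEIGHT) / 2

image = [[shade(x, y) for x in range(WIDTH)] for y in range(HEIGHT)]
print(image[120][160])   # brightness of the pixel in the middle: 0.5
```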

In late 2006, Nvidia released the GeForce 8 series of graphics cards, which introduced “general stream processing units” that let people write highly parallel code that executed on the graphics card. Before that, your code would actually convert your data to texture maps (a computer graphics thing) and would execute algorithms (like to predict the stock market or whatever) by drawing a triangle or quad with an appropriate pixel shader. You literally had to shoehorn your compute algorithms into something the graphics card could process. Which, you know, is dumb, but it worked.
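Here’s a loose sketch of that shoehorning, faked in Python with NumPy. Nothing here touches a real graphics API; the “texture” and the “pixel shader” are just stand-ins for the trick described above:

```python
import numpy as np

# Old-school GPGPU in spirit: disguise your data as a texture, then
# "draw" it with a pixel shader that secretly does your real computation.
# The sizes and the calculation inside pixel_shader are made up.

payload = np.random.rand(256 * 256)       # the numbers you actually care about
texture = payload.reshape(256, 256)       # disguised as a 256x256 "texture"

def pixel_shader(value):
    # the real work hides inside the "shader"
    return value * 0.5 + 0.1              # some arbitrary calculation

# "Rendering" the quad = running the shader over every texel in parallel.
result_texture = pixel_shader(texture)    # NumPy applies it element-wise
result = result_texture.reshape(-1)       # read the answers back out
```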

Since then, Nvidia and other companies have built more accessible and generalized ways to tap the compute power on their GPUs… So, things are better now.

What does this have to do with AI? It is simple. #

If you’ve been following along with my articles, you’ll know that AI is really just taking a bunch of numbers, feeding them into a thing that computes a ton of simple functions (usually the slope-intercept equation of a line), and spitting out another set of numbers that, to be honest, seem like pure frickin’ magic. Those input numbers could represent a question asked in English and the output could be a profound answer to that question. Those input numbers could be a recording of you singing a song you wrote and the output could be audio data of Wayne Newton singing that song.
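Here’s roughly what that “ton of simple functions” looks like as a tiny Python/NumPy sketch. The layer sizes and random weights are made up; a real model has billions of them:

```python
import numpy as np

# Numbers in, numbers out: each layer is just y = w*x + b done many
# times over, plus a squashing step. Weights here are random and the
# sizes are tiny, purely for illustration.

rng = np.random.default_rng(0)

inputs = rng.standard_normal(8)                          # "a bunch of numbers" in
W1, b1 = rng.standard_normal((16, 8)), rng.standard_normal(16)
W2, b2 = rng.standard_normal((4, 16)), rng.standard_normal(4)

hidden = np.maximum(0, W1 @ inputs + b1)                 # slope-intercept, en masse
outputs = W2 @ hidden + b2                               # another set of numbers out

print(outputs)
```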

To do either of those things, you have to do a ton of computation… and to make it even halfway usable, you have to do those computations in parallel. The CPU in your device can’t handle that kind of parallel workload, so it offloads it to a GPU (usually in the form of a graphics card), which happily does a bazillion calculations a second.
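Here’s a hedged sketch of what that offloading looks like in practice, using PyTorch (assuming you have it installed, and a CUDA-capable graphics card for the GPU part). The matrix sizes are arbitrary:

```python
import torch

# The same big matrix multiply, either ground through on the CPU or
# handed off to the GPU if one is available.

a = torch.randn(4096, 4096)
b = torch.randn(4096, 4096)

cpu_result = a @ b                        # the CPU works through it

if torch.cuda.is_available():             # is there a GPU to offload to?
    a_gpu, b_gpu = a.to("cuda"), b.to("cuda")
    gpu_result = a_gpu @ b_gpu            # thousands of cores, all at once
```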

So, yeah, that’s why AI needs graphics cards. Hope that makes sense.

  * The only thing worse than webinars is when someone writes something like, “so many of you have asked me about…” I mean, come on, no one is asking you anything.