Just shy of seven years ago, Nvidia launched CUDA. No longer would video card performance just be about games, they said; soon we would offload other work onto the video card. And there was reason to believe that this could be useful: the GeForce 8800 GTX had 128 shaders; for comparison, the first quad core CPUs had just launched. Video cards had vastly more computational power available, and Nvidia was going to make it available to other programs without having to cram everything into a graphics pipeline.
If you've never used CUDA for anything, you're not alone. Today, all but the most die-hard fanboys have been disabused of the notion that CUDA would ever matter for consumer use. The problem is that all that GPU power was simply hard to put to good use for anything other than games. In order to see major gains from GPU compute, you needed algorithms that, among other things:
1) are embarrassingly parallel (i.e., can have an enormous number of threads operate independently with no knowledge of what is going on in other threads),
2) use almost exclusively 32-bit floating point operations,
3) are very, very SIMD-friendly,
4) can do many, many computations per access to video memory,
5) don't need to communicate with the CPU very much (relative to the amount of computation),
6) involve little to no branching, and
7) actually have a ton of computational work that needs to be done (because CUDA is pointless if something is already very fast on a simple CPU). For a sense of what code that clears all seven hurdles looks like, see the sketch just after this list.
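To make that list concrete, here's a rough sketch (my own toy example, not anything from a real application) of the kind of kernel that clears every one of those hurdles: one independent thread per element, pure 32-bit floating point, a fixed-count loop with no divergent branching, and hundreds of arithmetic operations for every read or write of video memory.

    // Hypothetical illustration: each thread owns one element, uses only fp32,
    // runs a fixed-count loop (no divergent branching), and does hundreds of
    // flops for a single read and a single write of video memory.
    __global__ void iterate_map(const float *in, float *out, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // one independent thread per element
        if (i >= n) return;
        float x = in[i];                                // one global memory read
        for (int k = 0; k < 256; ++k)                   // lots of arithmetic per memory access
            x = 3.9f * x * (1.0f - x);                  // logistic map, pure 32-bit float
        out[i] = x;                                     // one global memory write
    }

The launch and memory-copy plumbing is the easy part; the hard part is that almost nothing consumers actually run looks like this.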
That rules out just about every program that you've ever used. The one widely-used thing that fits all of them is the GPU portion of 3D rendering--i.e., games.
To be fair, there were some other things that could use CUDA, such as synthetic benchmarks or tech demos, or stuff like Folding@home, which may offer some benefit to society but doesn't really offer any benefit to the particular person running it. That things like these are the best examples of "consumer" uses of CUDA illustrates the problem.
Today, CUDA is basically obsolete for consumer use, too. It's restricted to Nvidia GPUs only, so if you do have an algorithm that is GPU-compute friendly and don't want to eliminate a large fraction of your customer base by going with a proprietary API, you use OpenCL instead.
That's not to say that CUDA is completely irrelevant. The proprietary nature of CUDA isn't a problem in supercomputers, where you can buy a zillion of card X and then write code that targets the particular card you bought. If you're building a supercomputer, you probably have some specific intended use for it and can pick the parts that best fit your specific use. But that's not something that random consumers do.
Ironically enough, in 2011, someone came up with a new class of program that random consumers could benefit from running and that satisfied all of the conditions above except #2: bitcoin mining. Instead of 32-bit floating point, it uses almost exclusively 32-bit integer operations. (See, the title of this thread is actually relevant to the thread.)
Then people discovered that Radeon cards were good at integer operations and GeForce cards weren't. Or perhaps rather, Radeon cards were much better at integer operations than GeForce cards; all video cards were massively better at floating point operations than integer operations. More directly, people discovered that Radeon cards were vastly better at bitcoin mining than GeForce cards, even if people didn't realize that integer performance was the reason.
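For those wondering why integer performance is what matters here: bitcoin mining is just double SHA-256 over a block header with a varying nonce, and a SHA-256 round is built entirely out of 32-bit rotates, xors, ands, and additions. A sketch of the building blocks (these are the standard SHA-256 definitions, not code lifted from any particular miner):

    // 32-bit rotate right: one of the core operations of every SHA-256 round.
    __device__ unsigned int rotr(unsigned int x, int n)
    {
        return (x >> n) | (x << (32 - n));
    }

    // The round functions are nothing but shuffled bits and 32-bit adds:
    __device__ unsigned int big_sigma1(unsigned int e) { return rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25); }
    __device__ unsigned int ch(unsigned int e, unsigned int f, unsigned int g)  { return (e & f) ^ (~e & g); }
    __device__ unsigned int maj(unsigned int a, unsigned int b, unsigned int c) { return (a & b) ^ (a & c) ^ (b & c); }

A miner hammers on operations like these billions of times per second while touching video memory hardly at all, so raw 32-bit integer throughput is nearly the whole story.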
The reason video cards focus on floating point performance is that 3D graphics (i.e., what people buy video cards for) uses floating point operations for just about everything. Translations, rotations, lighting, and just about everything else you can think of work wonderfully with floating point computations and badly or not at all with integer computations. Furthermore, some "small" integer computations can be done just as well with floating point data types: a 23-bit mantissa is enough to add, subtract, and multiply (but not divide!) integers up into the millions with no loss of precision. While OpenGL (and probably also DirectX) makes both floating point and integer data types available, games use almost exclusively floating point.
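To put a number on that mantissa claim (my own back-of-the-envelope check, not something out of a spec sheet): a float carries 24 significant bits (23 stored plus the implied leading 1), so every integer up to 2^24 = 16,777,216 is represented exactly, and additions, subtractions, and multiplications stay exact as long as the result also fits.

    // Host-side check of the "small integers are exact in float" claim.
    #include <stdio.h>

    int main(void)
    {
        float a = 3000000.0f, b = 4000001.0f;      // both well under 2^24
        printf("%.1f\n", a + b);                   // prints 7000001.0 -- exact
        printf("%.1f\n", 16777216.0f + 1.0f);      // prints 16777216.0 -- 2^24 + 1 is not representable
        return 0;
    }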
The only graphical use for integer data that I've come up with is as a quick and dirty "random" number generator: take a random, unsigned integer in some interval, multiply it by a fixed, large, odd number (e.g., in the hundreds of millions), and then treat the output as though it is uniformly distributed in [0, 2^32). If it's not obvious why you would want to do that, that's kind of the point, though I've found it useful for some geometry computations in particle effects.
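In code, the whole trick is one multiply, with the wraparound of unsigned 32-bit arithmetic doing the scrambling. A sketch, with an arbitrary large odd constant of my own choosing (nothing in particular is prescribed here):

    // Quick and dirty "randomizer": multiply by a large odd constant and let
    // unsigned 32-bit overflow wrap. The mixing is crude, but it's often good
    // enough to scatter particle effects.
    __device__ unsigned int scramble(unsigned int x)
    {
        return x * 747796405u;   // arbitrary large odd constant (hypothetical choice)
    }

    // e.g. scramble(particle_id) gives each particle a value scattered across
    // [0, 2^32), which can then be scaled down to whatever range the effect needs.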
But if lots of programs are going to be offloaded onto a GPU, then integer computations are going to matter. AMD is pushing for this, and is making architectural changes (such as having the CPU in the same chip as the GPU) to take #5 off of the requirements list above. Solid GPU integer performance strikes #2 off of the list, too. But Nvidia can't put the GPU in the same chip as the CPU apart from Tegra--and if you need to do intense computations, using a tablet or cell phone for it is doing it wrong.
So ironically enough, after Nvidia had pushed for GPU compute in the consumer space for years, the first consumer application that got people buying video cards for GPU compute had them buying exclusively AMD cards, because it needed integer computations, not floating point. That has probably helped AMD's bottom line a bit, but it's still a long, long way from rivaling games as a reason to buy a video card.
But what about the future? Will GPU integer performance matter? Judging by current architectures, AMD is betting that it will and Nvidia is betting that it won't. To say that AMD is betting the company on this is a little too strong, but only a little: if GPU integer performance doesn't matter to consumers, then it's unlikely that heterogeneous computing will ever matter to consumers. And heterogeneous computing (i.e., having programs that traditionally would use only the CPU offload a lot of work to the GPU) is AMD's only real plan for having a CPU that can beat Intel's in the next several years.
But if AMD is right, then we could start seeing programs where a quad core AMD APU completely destroys Intel's best quad core CPU, if the AMD chip can efficiently have the GPU do the bulk of the work that Intel has to do on the CPU. I'd expect to see a trickle of such programs in coming years, but don't expect it to ever be all that widespread. Even so, AMD doesn't need all or even most programs to see major gains from heterogeneous computing; the only programs that matter are those where some CPUs aren't good enough.