Nvidia launched their Turing architecture about a year and a half ago. In addition to the traditional functionality of GPUs, their higher end Turing cards also used quite a bit of silicon for two new things: ray tracing and tensor cores. Real-time ray tracing has some merit, but is not the point of this thread.
The idea of tensor cores was to dedicate a portion of the chip to accelerating certain types of machine learning. Using part of a GPU for some dedicated purpose like this is hardly a new idea. GPUs have had video decode blocks to accelerate video decoding (e.g., watching videos on YouTube) for many years now. But while video decode blocks offer obvious benefits to consumers, tensor cores for machine learning do not.
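For what it's worth, the thing they accelerate is small matrix multiply-accumulate operations in reduced precision, which is the bread and butter of neural networks. A rough NumPy sketch of the primitive, with the tile size and precision details being illustrative rather than Nvidia's exact spec:

```python
import numpy as np

# Rough sketch of the kind of primitive a tensor core accelerates: a small
# matrix multiply-accumulate, D = A @ B + C, with reduced-precision inputs
# and a wider accumulator. Sizes and precisions are illustrative only.
def mma(A, B, C):
    A16 = A.astype(np.float16)   # low-precision inputs
    B16 = B.astype(np.float16)
    return A16.astype(np.float32) @ B16.astype(np.float32) + C  # accumulate in FP32

A = np.random.rand(16, 16)
B = np.random.rand(16, 16)
C = np.zeros((16, 16), dtype=np.float32)
print(mma(A, B, C).shape)  # (16, 16)
```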
Furthermore, adding the tensor cores wasn't free. They take a lot of die space, which makes the dies more expensive to build, so the resulting video cards have to sell for more at retail in order to be profitable. So basically, Nvidia had to come up with some excuse as to why gamers should be willing to pay extra for a feature that is useless to them.
I suspect that the real reason Nvidia put tensor cores into the cards is for the sake of Tesla cards that would be based on the same GPU chips, especially the Tesla T4. Consumers were never the target market, which is why the lower end Turing cards that were intended for graphics and other consumer uses didn't get the tensor cores. But convincing gamers that it was worth paying extra for was still a software and marketing problem.
Nvidia's answer to this was Deep Learning Super Sampling, or DLSS. The idea was that you could render a game at a lower resolution, then use machine learning to upscale it to the resolution you wanted. This would make heavy use of the tensor cores, so gamers could get some use out of them. The resulting image quality wouldn't be as good as just rendering at the native resolution. But it would be cheaper to do, and so it would run faster.
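As a sketch of the idea, with a dumb nearest-neighbor upscale standing in for the neural network, which is exactly the part that can't be reproduced in a few lines:

```python
import numpy as np

def render(width, height):
    # Stand-in for the game's renderer; the point is only that cost scales
    # with how many pixels you actually render.
    return np.random.rand(height, width, 3)

def upscale(frame, out_w, out_h):
    # Nearest-neighbor upscale standing in for the DLSS network.
    in_h, in_w, _ = frame.shape
    ys = np.arange(out_h) * in_h // out_h
    xs = np.arange(out_w) * in_w // out_w
    return frame[ys][:, xs]

low   = render(2560, 1440)        # render fewer pixels, so it runs faster...
final = upscale(low, 3840, 2160)  # ...and let inference fill in the rest
print(final.shape)                # (2160, 3840, 3)
```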
DLSS was a complete disaster. The basic problem is that applying DLSS takes a lot of computational work. Instead of rendering a game at one resolution and then using DLSS, you could get the same frame rates by rendering it at a significantly higher resolution and then using traditional upscaling. The latter resulted in much better image quality than the former, leaving DLSS completely useless.
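To put rough numbers on it (these resolutions are illustrative, not measurements from any particular game):

```python
# Render cost scales roughly with pixel count, and the DLSS pass itself
# isn't free, so compare what each path actually renders for a 4K target.
dlss_render = 2560 * 1440    # a typical DLSS input resolution for 4K output
native_4k   = 3840 * 2160    # what native rendering would cost
traditional = 3200 * 1800    # render higher, then upscale conventionally

print(dlss_render / native_4k)   # ~0.44 of the native pixel budget
print(traditional / native_4k)   # ~0.69: still much cheaper than native,
                                 # with far more real samples to work from
```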
The underlying reason for this is what I said when DLSS was first announced: there is no substitute for having enough samples. Approximations and interpolation can never hope to compete with actually knowing the underlying ground truth. The question was never whether DLSS would degrade image quality, but only how badly. Most people don't buy a $1200 video card and a 4K monitor with the aim of running games at not very high graphical settings. And make no mistake: no matter how computationally intensive it is, the image quality with DLSS on is that of not very high graphical settings.
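Here's a toy illustration of the point, independent of how clever the network is: once detail is finer than the sample spacing, it isn't in the data at all, and there is nothing for the upscaler to recover.

```python
import numpy as np

# A one-pixel checkerboard "rendered" at half resolution: the pattern isn't
# merely degraded, it vanishes entirely, and no upscaler can bring it back.
fine = np.indices((8, 8)).sum(axis=0) % 2                # ground truth detail
low  = fine[::2, ::2]                                    # half-resolution samples
up   = np.repeat(np.repeat(low, 2, axis=0), 2, axis=1)   # best-effort upscale

print(low)                        # all zeros: the detail was never sampled
print(np.abs(fine - up).mean())   # 0.5, i.e., half the pixels are just wrong
```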
Back when Turing was launched, Nvidia also announced DLSS 2X. The idea was that, unlike plain DLSS, you would start by rendering a game at the native resolution you wanted, then use machine learning to try to improve the image. Unlike plain DLSS, it seemed plausible that this could be useful. After all, there have been a number of other post-processing algorithms in that vein, including Nvidia's own FXAA, as well as SMAA and, most recently, AMD's Radeon Image Sharpening. And to my eyes at least, FXAA tends to be a net benefit.
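Those all operate on a frame that was already rendered at full resolution, which is the key difference from DLSS. Even something as simple as an unsharp mask captures the flavor of that kind of post-process; to be clear, this is a generic sharpening filter, not any vendor's actual algorithm:

```python
import numpy as np

def box_blur(img):
    # 3x3 box blur with edge padding.
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += padded[1 + dy : 1 + dy + img.shape[0],
                          1 + dx : 1 + dx + img.shape[1]]
    return out / 9.0

def sharpen(img, amount=0.5):
    # Unsharp mask: push each pixel away from its local average.
    return np.clip(img + amount * (img - box_blur(img)), 0.0, 1.0)

frame = np.random.rand(1080, 1920)   # a native-resolution luminance channel
print(sharpen(frame).shape)          # (1080, 1920)
```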
A year and a half later, it looks like DLSS 2X is dead. As far as I'm aware, no game ever bothered to implement it. It's possible that some game developers implemented it, decided it was useless, and never shipped it. That should have been the end of DLSS.
But Turing is still Nvidia's latest GPU architecture. And it's still expensive, because the die sizes are still enormous. And it still has tensor cores, so Nvidia still wants to convince gamers to care about them. And so they are today announcing DLSS 2.0.
Like DLSS, the idea of DLSS 2.0 is to start by rendering a game at a lower resolution, then use machine learning to upscale it. Unlike DLSS, it doesn't rely on the lower resolution image alone: it also takes motion vectors that say which way things on screen are moving. Those motion vectors can be recorded cheaply while rendering the original image, so getting them isn't a huge computational burden, though it does require substantial additional programming work.
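The natural way to use those inputs is TAA-style temporal reuse: warp the previous frame's output to where things are now, then blend it with the new low-resolution frame. Here's a minimal sketch of that general mechanism, with a fixed blend weight standing in for the network; treat it as the idea, not Nvidia's implementation:

```python
import numpy as np

def reproject(prev_output, motion):
    # Fetch last frame's color from wherever each pixel came from.
    # motion[y, x] = (dy, dx), the per-pixel motion since the previous frame.
    h, w = prev_output.shape
    ys, xs = np.indices((h, w))
    src_y = np.clip(ys - motion[..., 0], 0, h - 1).astype(int)
    src_x = np.clip(xs - motion[..., 1], 0, w - 1).astype(int)
    return prev_output[src_y, src_x]

def temporal_upscale(low_res, prev_output, motion, alpha=0.1):
    # Crude 2x upscale of the new frame plus reprojected history; the fixed
    # blend weight is where the machine learning would actually come in.
    upscaled = np.repeat(np.repeat(low_res, 2, axis=0), 2, axis=1)
    history = reproject(prev_output, motion)
    return alpha * upscaled + (1 - alpha) * history

low_res     = np.random.rand(540, 960)      # this frame, rendered small
prev_output = np.random.rand(1080, 1920)    # last frame's full-resolution output
motion      = np.zeros((1080, 1920, 2))     # per-pixel (dy, dx) from the renderer
print(temporal_upscale(low_res, prev_output, motion).shape)  # (1080, 1920)
```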
The problem is that relying on motion vectors will give you garbage when there is acceleration involved, whether it's the object or the camera that is accelerating. That can easily produce image details that are wildly wrong. So basically, DLSS 2.0 will have a lot of the same problems as TAA, which Nvidia quite rightly threw shade on when launching the original DLSS.
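A toy one-dimensional version of that worry, and to be clear, this is about the general hazard of leaning on motion vectors, not an analysis of Nvidia's specific implementation: a motion vector only records where something moved between two frames, so anything that treats it as a guide to where things are going is implicitly assuming constant velocity.

```python
# An object accelerating across the screen, one position per frame.
positions = [0.0, 1.0, 3.0, 6.0]

motion_vector = positions[2] - positions[1]   # what the history knows: +2.0
predicted     = positions[2] + motion_vector  # constant-velocity guess: 5.0
actual        = positions[3]                  # where it really is: 6.0

print(predicted, actual)   # the guess lands a full pixel away from the truth
```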
DLSS 2.0 could plausibly be less bad than DLSS. But it's still going to be useless. I'm hoping that Nvidia's next generation of GeForce cards will drop the tensor cores so that they can stop this foolishness and instead put their efforts into something more promising, like ray tracing.
Comments
And it will be either an AMD praise piece, or some Intel or Nvidia FUD piece. The latter this time, it seems.
Nothing to see here. Move along.
https://www.youtube.com/watch?v=h2rhMbQnmrE&t=5s
https://www.youtube.com/watch?v=Uu_no9zX068