Inside the second - changing how GPUs are benchmarked

Zezda, Member Uncommon, Posts: 686

Hey everyone

 

I don't usually post much on the forums these days, but I've not seen anyone mention this before, and it's going to be quite important (if it isn't a big deal already).

Basically, the gist of it is that for about as long as anyone has been running these tests, graphics card performance has been measured in FPS (frames per second). While that's useful and can help show the differences between cards, it doesn't really address the subjective side of gaming. I suppose it mostly started with people noticing 'micro-stutter' on multi-GPU setups, but it also happens to a lesser extent on all cards, regardless of configuration, because of the way frames are rendered and then displayed.

I fully recommend giving these articles a thorough read and really taking in the information provided. What makes me excited about this direction is that if journalists start measuring performance this way, it gives us two things: we can still see the general performance of cards (FPS), and we can also see how stable and consistent the cards are. That puts a lot more focus on the software driving the graphics cards, and it will force Nvidia and AMD to make sure their products are better supported and of higher quality overall. The consistency part matters a lot more than people give it credit for; it's why I couldn't play Crysis 2 on three monitors even though my FPS never dropped below 30, and it's why people have bought a card thinking it could hit the FPS they needed, only to find that their in-game experience told them something was wrong. By benchmarking cards more accurately we, as consumers, get a better picture of what we are buying into, and at the end of the day that's good for us and for the market, since it will push the vendors to release better products as we become better informed about what we are purchasing.
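
As a concrete (and entirely made-up) illustration of why averages hide this, the short Java sketch below compares average FPS with the 99th-percentile frame time for a run that is mostly smooth 15 ms frames with a couple of 100 ms stalls mixed in: the average still looks healthy, but the stalls are exactly what you feel.

// Hypothetical numbers; only meant to show average FPS vs. frame-time percentiles.
import java.util.Arrays;

public class FrameTimeDemo {
    public static void main(String[] args) {
        double[] frameTimesMs = new double[60];
        Arrays.fill(frameTimesMs, 15.0);          // mostly smooth 15 ms frames...
        frameTimesMs[20] = 100.0;                 // ...with two 100 ms stalls
        frameTimesMs[40] = 100.0;

        double totalMs = Arrays.stream(frameTimesMs).sum();
        double avgFps = 1000.0 * frameTimesMs.length / totalMs;

        double[] sorted = frameTimesMs.clone();
        Arrays.sort(sorted);
        double p99 = sorted[(int) Math.ceil(0.99 * sorted.length) - 1];

        // Prints roughly 56 FPS average, yet a 100 ms 99th-percentile frame time.
        System.out.printf("Average FPS: %.1f%n", avgFps);
        System.out.printf("99th-percentile frame time: %.1f ms%n", p99);
    }
}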

Anyway, here are the articles I mentioned. Anyone who plays games on a PC should be reading this to understand the issue and what is being done about it.

http://techreport.com/review/21516/inside-the-second-a-new-look-at-game-benchmarking - First article

http://techreport.com/review/24553/inside-the-second-with-nvidia-frame-capture-tools - Most recent

I'm sure we all get our information from different websites, but the two I have been most pleased with recently have been Anandtech and Techreport. Between them they usually cover most of the things I'm interested in.

Comments

  • Quizzical, Member Legendary, Posts: 25,351

    Interesting stuff, but they're wandering way out into the weeds, into data that is hard to even define, let alone measure.  The basic problem is: how do you define how long a frame is?  At what point do you say, this is the start of a frame, and the time that the frame takes is from this point in this frame to the same point in the next frame?

    In my game engine, there are a bunch of different things that you could point to.  Is it the moment that the video card finishes rendering a frame and is ready to send it to the monitor?  Is it when the rendering thread starts the display() function?  How about when it finishes the display() function?  When other threads inform the rendering thread that all of the CPU work is done and stuff just needs to be sent to the processor?  When the rendering thread starts handling the between-frames maintenance such as sending new textures to the video card?  How about when it ends that between-frames maintenance?  When the rendering thread informs another thread that it is done with a frame so the CPU-side work for the next frame can start?

    I think that the best answer is probably when the game engine checks the current time to determine how far to increment the state of the game world before drawing the next frame.  The problem is that neither Fraps nor FCAT have any clue about that.  Fraps will tend to be closer to that value, as there's less other stuff between what Fraps can measure and what FCAT can measure.

    Meanwhile, that's something that is completely trivial for a game engine to measure and record.  All that you'd need to do is to set up an array to store the times, a counter variable that keeps track of which frame is being timed, and a single line of code in the game engine to copy the frame time that it already has to determine for other purposes into that array.  And then you'd need some code to output the contents of the array to a file or some such when you're done measuring.

    Actually, the exact source code that I use to output my current frame rate in my own game engine is:

    System.out.println("Frame rate: " + 1000000000l / (timestamp - oldFrameTime)); ////

    One line, and not even an excessively complicated line.  The "timestamp" and "oldFrameTime" variables are defined and used elsewhere, for reasons unrelated to computing my frame rate.  The "////" is only there to make it easy to find that particular line by the Eclipse find/replace function, since I comment it out and then enable it again a lot.  You do need to have the console available to read that, though, and it does scroll off pretty fast, making it inappropriate for measuring frame times over the course of a minute rather than a few seconds.
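
    Below is a minimal sketch of the array-based logging described above: an array of frame times, a counter, one call per frame, and a dump to a file afterwards. Only the println line is from the actual engine; the class, the variable names, and the FRAME_LIMIT constant are assumptions made up for illustration.

    // Hypothetical sketch of per-frame time logging; not the engine's real code.
    import java.io.FileWriter;
    import java.io.IOException;

    public class FrameTimeLogger {
        private static final int FRAME_LIMIT = 10_000;      // how many frames to record (assumed)
        private final long[] frameTimesNs = new long[FRAME_LIMIT];
        private int frameCount = 0;

        // Call once per frame, passing the same timestamps the engine already keeps.
        public void record(long timestamp, long oldFrameTime) {
            if (frameCount < FRAME_LIMIT) {
                frameTimesNs[frameCount++] = timestamp - oldFrameTime;
            }
        }

        // Write the recorded frame times out in milliseconds when done measuring.
        public void dump(String path) throws IOException {
            try (FileWriter out = new FileWriter(path)) {
                for (int i = 0; i < frameCount; i++) {
                    out.write(frameTimesNs[i] / 1_000_000.0 + "\n");
                }
            }
        }
    }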

  • Zezda, Member Uncommon, Posts: 686
    Originally posted by Quizzical

    Interesting stuff, but they're wandering way out into the weeds, into data that is hard to even define, let alone measure.  The basic problem is: how do you define how long a frame is?  At what point do you say, this is the start of a frame, and the time that the frame takes is from this point in this frame to the same point in the next frame?

    In my game engine, there are a bunch of different things that you could point to.  Is it the moment that the video card finishes rendering a frame and is ready to send it to the monitor?  Is it when the rendering thread starts the display() function?  How about when it finishes the display() function?  When other threads inform the rendering thread that all of the CPU work is done and stuff just needs to be sent to the processor?  When the rendering thread starts handling the between-frames maintenance such as sending new textures to the video card?  How about when it ends that between-frames maintenance?  When the rendering thread informs another thread that it is done with a frame so the CPU-side work for the next frame can start?

    I think that the best answer is probably when the game engine checks the current time to determine how far to increment the state of the game world before drawing the next frame.  The problem is that neither Fraps nor FCAT have any clue about that.  Fraps will tend to be closer to that value, as there's less other stuff between what Fraps can measure and what FCAT can measure.

    Meanwhile, that's something that is completely trivial for a game engine to measure and record.  All that you'd need to do is to set up an array to store the times, a counter variable that keeps track of which frame is being timed, and a single line of code in the game engine to copy the frame time that it already has to determine for other purposes into that array.  And then you'd need some code to output the contents of the array to a file or some such when you're done measuring.

    Actually, the exact source code that I use to output my current frame rate in my own game engine is:

    System.out.println("Frame rate: " + 1000000000l / (timestamp - oldFrameTime)); ////

    One line, and not even an excessively complicated line.  The "timestamp" and "oldFrameTime" variables are defined and used elsewhere, for reasons unrelated to computing my frame rate.  The "////" is only there to make it easy to find that particular line by the Eclipse find/replace function, since I comment it out and then enable it again a lot.  You do need to have the console available to read that, though, and it does scroll off pretty fast, making it inappropriate for measuring frame times over the course of a minute rather than a few seconds.

    With you completely. In an ideal world we would be able to peer into the process at any point along the pipeline but we just can't with the tools available to us.

     

    Still, any tool that helps us benchmark what is happening at any point along the line, rather than relying on a single point of reference, helps us build up a better picture not just of how the hardware performs but of how the drivers and possibly even the game engine itself can influence things. All in all I think it's good that more attention is being paid to what is actually being measured, and to the understanding that a lot can happen in one second.

     

    [EDIT] One example, along the same vein as the game's timekeeping you were talking about (if I'm remembering correctly), is Skyrim. From what I remember, the timing the engine keeps can get messed up quite easily if you uncap the frame limiter, and it can break NPCs and the AI in general as well. It certainly doesn't seem impossible that it could be messing with the engine's physics too.

  • Cleffy, Member Rare, Posts: 6,412
    Using something like VSync helps with micro-stutter since it controls the flow of frames.  How do you determine the time between frames?  You can always set up a wait system to render a frame every 16 ms, and only blit to the backbuffer once the 16 ms mark is reached.  To me, designing a game around the experience is more important than reaching benchmark results.  Companies target getting their game running at a certain FPS, so artificially limiting it with a timer effectively maintains a nice, even flow of frames.  Then, of course, there's also making sure they don't somehow overload the calculations done in any one frame, by doing something like delaying the loading of a character a frame or two in order to display the frame on time.
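
    A rough sketch of the fixed 16 ms cap described above, in Java to match the code elsewhere in the thread: do the frame's work, sleep out whatever is left of the 16 ms budget, then present. The renderFrame() and presentFrame() calls are placeholders rather than a real API, and in practice Thread.sleep is coarse enough that an engine would usually spin for the last part of the wait.

    // Illustrative frame limiter; placeholder render/present calls, assumed ~60 Hz target.
    public class FrameLimiter {
        private static final long TARGET_NS = 16_000_000L;  // ~16 ms per frame

        public void runLoop() throws InterruptedException {
            while (true) {
                long start = System.nanoTime();
                renderFrame();                               // placeholder: game + render work
                long remaining = TARGET_NS - (System.nanoTime() - start);
                if (remaining > 0) {
                    Thread.sleep(remaining / 1_000_000, (int) (remaining % 1_000_000));
                }
                presentFrame();                              // placeholder: swap/blit the backbuffer
            }
        }

        private void renderFrame() { /* ... */ }
        private void presentFrame() { /* ... */ }
    }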
  • Quizzical, Member Legendary, Posts: 25,351
    Originally posted by Cleffy
    Using something like VSync helps with micro-stutter since it controls the flow of frames.  How do you determine the time between frames?  You can always set up a wait system to render a frame every 16 ms, and only blit to the backbuffer once the 16 ms mark is reached.  To me, designing a game around the experience is more important than reaching benchmark results.  Companies target getting their game running at a certain FPS, so artificially limiting it with a timer effectively maintains a nice, even flow of frames.  Then, of course, there's also making sure they don't somehow overload the calculations done in any one frame, by doing something like delaying the loading of a character a frame or two in order to display the frame on time.

    It's not that simple.

    http://www.mmorpg.com/discussion2.cfm/thread/374453

  • elohssa, Member, Posts: 38
    Originally posted by Zezda

    Hey everyone

     

    I don't usually post much on the forums these days, but I've not seen anyone mention this before, and it's going to be quite important (if it isn't a big deal already).

    Basically, the gist of it is that for about as long as anyone has been running these tests, graphics card performance has been measured in FPS (frames per second). While that's useful and can help show the differences between cards, it doesn't really address the subjective side of gaming. I suppose it mostly started with people noticing 'micro-stutter' on multi-GPU setups, but it also happens to a lesser extent on all cards, regardless of configuration, because of the way frames are rendered and then displayed.

    I fully recommend giving these articles a thorough read and really taking in the information provided. What makes me excited about this direction is that if journalists start measuring performance this way, it gives us two things: we can still see the general performance of cards (FPS), and we can also see how stable and consistent the cards are. That puts a lot more focus on the software driving the graphics cards, and it will force Nvidia and AMD to make sure their products are better supported and of higher quality overall. The consistency part matters a lot more than people give it credit for; it's why I couldn't play Crysis 2 on three monitors even though my FPS never dropped below 30, and it's why people have bought a card thinking it could hit the FPS they needed, only to find that their in-game experience told them something was wrong. By benchmarking cards more accurately we, as consumers, get a better picture of what we are buying into, and at the end of the day that's good for us and for the market, since it will push the vendors to release better products as we become better informed about what we are purchasing.

    Anyway, here are the articles I mentioned. Anyone who plays games on a PC should be reading this to understand the issue and what is being done about it.

    http://techreport.com/review/21516/inside-the-second-a-new-look-at-game-benchmarking - First article

    http://techreport.com/review/24553/inside-the-second-with-nvidia-frame-capture-tools - Most recent

    I'm sure we all get our information from different websites, but the two I have been most pleased with recently have been Anandtech and Techreport. Between them they usually cover most of the things I'm interested in.

    It's a bogus testing method developed by Nvidia to make their cards outbench AMD.  That's not to say it's worthless data, but there is a huge amount of bias in the tests.

     

    There is nothing really wrong with the old approach of min/max and average FPS.  It's the simplest approach, and the most meaningful to actual gameplay.

  • Quizzical, Member Legendary, Posts: 25,351
    Originally posted by elohssa

    It's a bogus testing method developed by Nvidia to make their cards outbench AMD.  That's not to say it's worthless data, but there is a huge amount of bias in the tests.

    There is nothing really wrong with the old approach of min/max and average FPS.  It's the simplest approach, and the most meaningful to actual gameplay.

    Actually, there is a lot wrong with the old approach of min, max, and average frame rates.  It's unable to catch spikes in which one frame takes a very long time, and those are hugely disruptive to gameplay.

    While Nvidia is likely pushing this approach because they figured out that they should start optimizing drivers for it before AMD did, it is meaningful information.

    I think it's interesting that as measured by when frames actually complete, for a single card, both AMD and Nvidia provide very smooth frame rates almost everywhere.  The real problem is that the game engine thinks that some frames take much longer than others.  FRAPS is closer to being able to measure that than Nvidia's tools, but there are still several layers of noise between what FRAPS can measure and what the game engine thinks is going on internally.

    How much the video drivers let the rendering thread queue up rendering commands before telling it to stop and wait has a considerable impact on spikes where the game engine thinks that one frame takes much longer.  There are also major issues with when the game engine decides to do extra, non-rendering work, such as buffering new data (most commonly textures) on the video card.  How drivers handle that could easily account for most or even all of the spikes in the data.

    Actually, I wonder if the spikes are caused by how buffering new data changes the amount of time in advance that the video drivers let the rendering thread queue up commands.  Ordinary rendering work such as passing uniforms or rendering commands to the video card or switching textures or vertex arrays will likely leave about the same amount of work in the queue at the end of each frame when the game engine thinks that it's time to start another.  Video drivers probably don't have a good handle on how much time sending new textures and creating mipmaps for them takes compared to ordinary rendering commands.

    Recompiling shaders and relinking programs will also give you a huge spike in your frame times, but that's generally done when the game is launched, or possibly when a user changes certain graphical settings or at loading screens.  (I have no idea how common it is to use loading screens or graphics settings changes as a time to recompile shaders, but those are the times when one could do it without it striking me as completely stupid.)  It's very unlikely that that will be done in the middle of a benchmark run.

    And there's also the problem that a bad game engine can create unnecessary spikes.  If you finish one frame and want to pass 50 new textures to the video card and create mipmaps for them before starting the next frame, then you're going to get a spike in your frame rates and there's nothing AMD or Nvidia can do about it.  That's fine at loading screens, of course, but if you're going to do that while the player is playing the game, you have to spread it out and not do too much in a single frame.
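
    As a sketch of the "spread it out" idea, one simple approach is to queue pending texture uploads and only send a few each frame instead of all 50 at once. The Texture type, the uploadToGpu() call, and the per-frame budget below are placeholders invented for illustration, not anything from a real engine.

    // Illustrative per-frame texture upload budget; all names here are made up.
    import java.util.ArrayDeque;
    import java.util.Queue;

    public class TextureUploadScheduler {
        private static final int UPLOADS_PER_FRAME = 2;      // assumed per-frame budget
        private final Queue<Texture> pending = new ArrayDeque<>();

        public void request(Texture t) {
            pending.add(t);
        }

        // Call once per frame, during the between-frames maintenance window.
        public void uploadSome() {
            for (int i = 0; i < UPLOADS_PER_FRAME && !pending.isEmpty(); i++) {
                uploadToGpu(pending.poll());                 // placeholder for the real upload + mipmap step
            }
        }

        private void uploadToGpu(Texture t) { /* send texture data, build mipmaps */ }

        // Placeholder type so the sketch is self-contained.
        static class Texture { }
    }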

  • Ridelynn, Member Epic, Posts: 7,383

    I've been over canned benchmark tools for a long time now.

    Once you get past the base specs (clock speed, core count, FLOPS/IOPS/MIPS/whatever, memory bandwidth, etc.), the rest is mainly down to driver and architectural decisions, and it varies widely from game to game.

    I like how some companies release a canned demo loop so you can "benchmark" your system. That does help some by eliminating a lot of variables, but the result is only relative to that particular game - it doesn't say too much about your hardware overall, and it only loosely correlates to performance in other software.

    The ultimate "benchmark" is how it looks and performs to you - the consumer. I think the best method may just be HardOCP's testing: they take a game and give you an apples-to-apples comparison on different hardware, but they also give you "Best Performance" settings, which can vary widely between cards and even between drivers. They also post FPS-over-time graphs - not perfect, but far better than just a min/max/average - so you can see where the big stalls occur and how they line up against other hardware on the same software load. And their editors are good about noting the little things that may not show up in the benchmark numbers (micro-stuttering, artifacting, clipping issues, etc.).
