Do you know what the root problem is with 3DMark2003? It’s not the benchmarkers, it’s the concept of the all-purpose benchmark.
You cannot please everyone with a single number. It is not that 3DMark failed where someone else could succeed; nobody can do this.
And if you think it can be done, you’re part of the problem.
You cannot come up with a benchmark that will please every rational person (forget the irrational ones). This is because different people want not only different things from a benchmark, they want contradictory things.
To give just one small example, some people want a benchmark that gives them an idea of how well a video card will do now. Others want a benchmark that gives them an idea of how well it will do in the future. You cannot do both with a single benchmark number; you cannot represent two very different measurements with one number.
You can’t even come up with a single number for current games. That’s because there isn’t a single way to program all games. Different games use different game engines, and they behave very differently.
People don’t have identical systems outside the video card, either. Different CPUs at different speeds and different FSBs have very different effects on different game engines.
A Muddled Number For Muddled Heads
Imagine a U.S. weather forecast that gave the average temperature in the United States.
And nothing else. No local temperatures. Alaska gets told the same U.S. temperature as Florida; Arizona gets the same temperature as Maine.
Would you want that? No, you wouldn’t, because it would be a muddle. Perfectly accurate and perfectly useless to you. You’d call anybody who demanded that retarded.
OK, let’s improve that. Instead of a synthetic U.S. average, we’ll give the real temperature for Bismarck, North Dakota, or any other single city you like.
Oh, you don’t like that, either? OK, we’ll average out Bismarck, LA, St. Louis, Miami and Boston, and give the whole country that. Is that better?
Wouldn’t you find that just as retarded?
But this is exactly what people want and demand when they ask for a single all-purpose number, and it doesn’t matter whether that number is “synthetic” or “natural.” The only difference is between being synthetically stupid and naturally stupid: just different ways of being dumb about this.
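A few lines of Python make the averaging problem concrete (the city temperatures below are invented for illustration, not real weather data):

```python
# Hypothetical January temperatures (degrees F) for five cities.
temps = {
    "Bismarck": 10,
    "Los Angeles": 68,
    "St. Louis": 35,
    "Miami": 75,
    "Boston": 30,
}

# The single all-purpose number: one average for everybody.
average = sum(temps.values()) / len(temps)
print(f"National average: {average:.1f}F")

# How badly does that one number serve each individual city?
for city, t in temps.items():
    print(f"{city}: off by {abs(t - average):.1f}F")
```

The average comes out to 43.6°F, which misses Bismarck and Miami by more than 30 degrees each. The number is perfectly accurate as an average and perfectly useless to anyone standing in any one of those cities.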
Now I could write a fifty-page article documenting this, but I know that no matter what I or anybody else says or does, a whole lot of people will not accept this. (We’ll explain why later, and why that hurts everyone.)
Different People, Same Results
Since it’s impossible to come up with a single useful all-purpose number, it really doesn’t matter who comes up with it or what they do. The fault is not with the creator but the concept.
It doesn’t even matter from the standpoint of keeping video card makers from spending a lot of time and effort overly optimizing for the test. Once you establish a limited test to create a “number,” any number, they will optimize for the test, any test.
It doesn’t matter who comes up with the benchmark, or how good or bad their intentions are. Kyle Bennett, Ed Stroligo, anybody else, it doesn’t matter. Even if Jesus Christ came down and created this benchmark, video card makers would still screw with it. Once you establish anything as a major or sole criterion for judgment, and enough people believe it, manufacturers will suck up to it.
So What’s The Answer?
The answer is to stop chasing the delusion of an all-purpose number. That has to come first. There is no answer if you don’t stop doing that, just different delusions.
Once you do that, then you can do something about it.
First, you choose games that are representative of popularly played games. Many games share a handful of common engines, and games built on the same engine generally follow the same performance pattern. So you test the commonly used game engines.
Second, you test at enough CPU speeds to roughly determine how much impact different speeds have on performance. You don’t have to test every speed grade, three or four will do fine.
Third, you test different FSB speeds enough to see whether or not that impacts performance.
Fourth, and most importantly, you don’t try to wrap this up in a single number. You give people the numbers and let them judge the plusses and minuses of a particular selection of equipment for the games they play.
This will create a lot of data, so you don’t try to satisfy the literal-minded. For instance, there’s no need to test every PIV speed grade; three or even two will do for a range of processors that aren’t any different besides speed.
Nor do you need a graph for everything you test. For instance, if a game goes hardly any faster with a PIV at 200MHz FSB than at 133MHz, you don’t need a graph to show that. Just say, “increasing FSB doesn’t do anything for this game.”
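The reporting style the four steps above describe might be sketched like this (every engine name, clock speed, and frame rate below is invented for illustration, and the 5% "not worth a graph" threshold is an assumption, not a rule from the article):

```python
# Hypothetical benchmark results: frames per second for each
# (game engine, CPU MHz, FSB MHz) combination. Deliberately NOT
# collapsed into one all-purpose score.
results = {
    ("Engine A", 1800, 133): 55,
    ("Engine A", 2400, 133): 71,
    ("Engine A", 2400, 200): 73,
    ("Engine B", 1800, 133): 90,
    ("Engine B", 2400, 133): 118,
    ("Engine B", 2400, 200): 119,
}

# Give people the individual numbers and let them judge.
for (engine, cpu, fsb), fps in sorted(results.items()):
    print(f"{engine}: {cpu}MHz CPU, {fsb}MHz FSB -> {fps} fps")

# Where a variable barely matters, say so in words instead of graphing it.
for engine in ("Engine A", "Engine B"):
    base = results[(engine, 2400, 133)]
    fast = results[(engine, 2400, 200)]
    if (fast - base) / base < 0.05:  # under 5% gain: skip the graph
        print(f"Increasing FSB doesn't do anything for {engine}.")
```

The point of the sketch is the shape of the output: a small table of real per-configuration numbers plus plain-language summaries of the non-findings, rather than one weighted score.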
The Core Problem: Functional Illiteracy
Of all the folks who do this sort of thing, Anandtech seems to have more of the core requirements in place than most, though it would take major revisions of their methodology to implement this.
But do you know what? If they did it, it probably wouldn’t improve matters much because you can lead a horse to water, but you can’t make him drink it.
In this case, you can’t get a lot of the horses to take more than a sip or two.
This is functional illiteracy, just in a different form.
What is functional illiteracy? It is the inability to read well enough to function in a given environment.
It’s usually meant to describe people who can’t read enough to function properly. Here, it’s a matter of people who won’t read enough to function properly.
In either case, you get the same result. Actually, this is worse than true illiteracy. Laziness is much harder to treat than illiteracy.
You may say, “So what? Not my problem,” but it is. If there are enough functional illiterates out there buying product and hanging on a single particular number, they’re the ones who determine what the video card manufacturers will do to satisfy them.
The more puppets there are, the more powerful the puppeteer.