I got this email from someone who said what some of you are probably thinking:
I am about to network my two computers, and I am an
overclocker so I was eagerly awaiting the NIC survey
results. I’m sorry to say I was disappointed.
Could you please update the article and post some hard
data to augment the generalities? Hard information
like the average FSB reached with Netgear (and other
brand) cards? Percentage (by brand) of those who
couldn’t overclock their cards at all? Highest speed
reached by brand with cards? Etc.
There is a group of people who think a number is better than an assessment. Numbers don’t lie.
Like hell they don’t.
Could I have provided this “hard” data? Sure. And it would have all been “wrong” data.
This Is Not A Lab
Do you know how I could have given this guy good “hard data”? Buy fifty or a hundred cards each of the twenty or so brands that got mentioned. Test each of them on a system previously proven to run at, say, 170Mhz. Then I could give you wonderful, meaningful “hard” statistical data, by about spring 2003.
That’s not an option. So instead, we ask a lot of you to tell us how you do with your system. There are no controls that hold everything constant except one factor; you’re not lab rats.
Do you know what most of you told me? “Hi, Ed, I reached XXX Mhz. The NIC has never given me a problem at that speed, and I don’t think it’s what’s stopping me from going any further.”
I don’t doubt that assessment was correct most of the time. Many times, you knew what was keeping you from going faster.
Let’s say you’re pretty sure your RAM is keeping you to just 140Mhz. Let’s say you’re absolutely right. Is that the NIC’s fault? Of course not.
But if I gave you an average, what I’m effectively doing is saying that it’s the NIC’s fault.
Now you might say those quirks would even out, and sure, if I had a couple thousand examples of each card, they probably would. We don’t. We didn’t have more than a couple dozen of any one card, and usually fewer.
So sure, I could come up with an average, but if one card did better than another, it would almost certainly be due to reasons that had nothing to do with the NIC.
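Here is a rough sketch of that effect in Python. Everything in it is an assumption made up for illustration: the 170Mhz NIC ceiling, the RAM and CPU ranges, the sample size of 24, and the names `reported_speed` and `survey`. None of it is survey data. Both imaginary brands get exactly the same NIC limit, so the NIC cannot be the reason their averages differ.

```python
import random
from statistics import mean

NIC_LIMIT = 170  # assumed: every NIC, of either brand, tops out at the same speed

def reported_speed(rng):
    """One survey answer: the system stops at its weakest component."""
    ram_limit = rng.randint(125, 175)  # made-up range for the owner's RAM
    cpu_limit = rng.randint(125, 175)  # made-up range for the owner's CPU
    return min(ram_limit, cpu_limit, NIC_LIMIT)

def survey(brand_seed, n=24):
    """Simulate n answers for one brand; the seed only varies the owners' other parts."""
    rng = random.Random(brand_seed)
    return [reported_speed(rng) for _ in range(n)]

brand_x = survey(brand_seed=1)
brand_y = survey(brand_seed=2)
print("brand X average:", round(mean(brand_x), 1))
print("brand Y average:", round(mean(brand_y), 1))
# Any gap between the two averages comes from RAM and CPU luck alone,
# because the NIC limit is identical for both brands.
```

Try a few different seeds. With only a couple dozen answers per brand, the two averages will usually differ, even though the cards are identical by construction.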
Averages Can Lie
Here are the results of card A:
Here are the results of card B:
All else being equal, it would seem like card A performs pretty consistently, while card B is all over the place. But guess what? They both average out to be exactly the same.
Does an average help you, or mislead you?
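A minimal sketch of the same idea in Python. The two lists are invented numbers chosen only to illustrate the point; they are not the survey results, and `card_a`, `card_b` and `summarize` are hypothetical names.

```python
from statistics import mean, pstdev

# Hypothetical FSB results in Mhz, invented for illustration only.
card_a = [148, 150, 150, 150, 152]  # tightly clustered
card_b = [120, 145, 155, 165, 165]  # all over the place, with one black sheep

def summarize(speeds):
    """Return (average, lowest, highest, spread) for a list of speeds."""
    return mean(speeds), min(speeds), max(speeds), pstdev(speeds)

for name, speeds in (("A", card_a), ("B", card_b)):
    avg, low, high, spread = summarize(speeds)
    print(f"card {name}: average={avg} low={low} high={high} spread={spread:.1f}")
```

Both lists average 150Mhz, yet only the low, high and spread figures tell you that the second card's owners had very different experiences.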
No NIC card looked like the pattern of card A. The patterns were much closer to card B. Even the Netgear cards had examples of cards running at 40Mhz or better. Every card with a decent sampling always had a black sheep or two, and also had people who for whatever reason didn’t push their systems to 165Mhz.
Very few people reported not being able to overclock at all. Reporting the highest speed doesn’t tell you anything other than somebody got lucky, and worse, the area where they got lucky probably wasn’t in the area of the NIC.
If You Want It Done Badly, That’s Just What You’re Going To Get
You don’t make data statistically valid by applying statistical formulas to it. It’s the nature of the data that makes the statistical procedure valid; the statistical procedure can’t make statistically invalid data valid.
If you try, all you get is “Garbage In, Garbage Out.” And I’m not going to serve you precise-looking statistics when I know they’re garbage.
When I asked the question, I honestly thought NICs were a problem, and I was going to hear about a lot of bottlenecks caused by them. If that had been the case, then I would have had a shot of presenting some half-hard data.
But that didn’t happen. Most of you did not get the opportunity to push your NIC to the breaking point. You ran your system at 112 or 120 or 124 or 133 or 145 or 150 or 155 or 160 or 165Mhz, and it didn’t give you a problem.
You can’t take the same approach when the NIC generally doesn’t cause bottlenecks that you would take if it did. You have to treat the data differently.
In this particular case, your comments about the cards were far more indicative of how they did than any particular numbers.
In most cases, the cards you described as a problem were long gone; you went out and got something that did work.
In this particular situation, there were almost as many complaints about Netgear cards you had personally owned as about all the other cards combined. Statistically valid? No. Indicative, yes.
Something Is Better Than Nothing
I’ve already explained why I’m not going to hand out numbers I know won’t give you a proper and good idea of what’s going on.
However, this is not a bipolar situation: “hard” data or nothing. After getting the data I got, it would have been equally inaccurate and improper to just say, “I can’t determine anything.”
Statistical validity is a pretty high standard. To get it, you either need a lot of samples, or there has to be a big inherent difference between the two items being mentioned (let’s leave random selection aside).
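That trade-off between sample count and size of difference can be sketched with a common rule of thumb: a gap between two averages is only believable once it exceeds roughly two standard errors. The Python below is a back-of-the-envelope illustration, not a proper test. The 15Mhz spread, the cutoff of two, and the name `smallest_believable_gap` are all assumptions, and it ignores random selection just as the paragraph above does.

```python
from math import sqrt

def smallest_believable_gap(spread, n, z=2.0):
    """Rough gap between two group averages needed to exceed z standard errors.

    Assumes both groups have n samples and the same spread (standard deviation).
    """
    standard_error_of_difference = spread * sqrt(2.0 / n)
    return z * standard_error_of_difference

SPREAD_MHZ = 15.0  # assumed spread of reported speeds; not a survey figure

for n in (24, 2000):
    gap = smallest_believable_gap(SPREAD_MHZ, n)
    print(f"{n} samples per card: averages must differ by about {gap:.1f} Mhz")
```

Under these assumed inputs, a couple dozen samples per card need a gap of several Mhz before it means anything, while a couple thousand need less than one.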
The sort of information we get from these surveys does not usually lend itself to rigorous statistical treatment. However, that does not mean you can’t learn anything from it.
From a statistical “hard data” point of view, I got lemons (but I bet you’re pretty happy your NIC card works fine overclocked). So I made lemonade. Better lemonade than bulls***.
Blind Faith In Numbers Is Still Blind
I know there’s a type of person out there who only wants to deal with “hard data” thinking that it’s “better.” He thinks that’s a smart thing to do.
If you think that, I have a bridge to sell you, and I’d have absolutely no problem selling it to you.
You can hide a lot of lies in a single number, and if you don’t know that, you are begging to be taken.
I remember getting a spreadsheet once. Lots of apparently good, solid hard numbers. By the time I got finished finding out what those numbers really meant, I estimated there was one serious lie per square inch hidden in those “hard” numbers.
So tell me about how good numbers are. 🙂
A number is an abstraction. It doesn’t tell you how it was made. It doesn’t tell you why it is. It may reflect reality, or be a travesty of it, mixing apples, oranges, bananas and grapefruit.
The person who wanted numbers probably thought he was going to get better information in the form of a few numbers. Actually, what he wanted was worse than nothing at all.
Knowing a number means nothing. Understanding a number (the how and where and why, and the limitations of that number) and being able to tell a good one from a bad one means a lot.
The U.S. Census just told us that America is 12% black. That does not prove I have an African-American great-grandparent.
Why I Wrote This
1) I wanted to point out that a common way of thinking can often be counterproductive, because statistics can often tell you the opposite of what you thought they would tell you, and
2) When we do surveys of other pieces of equipment, the evidence is likely to be even less conclusive than the NIC survey. It would be wonderful if we had armies of testers and mountains of equipment and could test under lab conditions.
We don’t and we can’t. Nobody can. Just running a reasonably solid NIC test would have cost us tens of thousands of dollars in equipment alone, and a couple of man-years’ worth of testing.
Nobody can or will do that. So we’re doing the next best thing, and getting whatever we can out of it.