The link we posted yesterday about the KT7 and its RMA rates brings up issues that have long troubled us at Overclockers.com.
Websites get equipment, often hand-selected by the manufacturer, on which to do a review. Is that equipment necessarily representative of the equipment you’re going to get? Maybe yes, maybe no.
How long does a site test a piece of equipment? If it’s a site that does a lot of hardware reviews, probably not very long; after all, there’s more coming all the time.
But even if it isn’t, “install it, run the benches, and get out,” how long is long enough? A few days? A couple weeks? Longer? How much equipment could anybody possibly test if everything got subjected to long-term tryouts? How long would you be willing to wait until they were through?
Finally, no matter how long the test takes or how conscientious the review is, the reviewer only has one sample. The manufacturer will make umpteen thousands, in some cases maybe even millions, of an item. Can you predict how all those units are going to behave based on your one?
Mass Production Doesn’t Yield Mass Results
If dealing with computer hardware has taught me anything, it’s taught me the real consequences of variation in a mass-produced item.
We’ve seen CPUs that literally sat right next to each other when they were being made, tested on the same platform, perform quite differently when overclocked.
In gathering my surveys, I’ve seen components act rather differently in very similar surroundings. Sure, there are more similarities than differences, but still some pretty big differences in performance.
I’ve read countless posts about people with problems with their computer equipment, and while many can be explained by the user not taking the right steps, there are still some that appear inexplicable.
You may say, “Well, DUH, some things are broken.” Thank you for that brilliant insight :), but often it’s not a matter of being broken; the item performs above spec, just not at the “spec” people have come to expect.
Others may say, “You have to expect this, it’s just statistical variation.” No denying that, but how do you predict statistical variation on a sample of one? You can’t. More importantly, how can you gauge and present such statistical variation to somebody just buying one of the items?
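To make the sample-of-one problem concrete, here’s a rough sketch with invented numbers (the 900 MHz mean and 40 MHz spread are illustrative assumptions, not data about any real CPU): even when a population has a well-defined average, any single unit a reviewer happens to draw can sit well above or below it.

```python
import random

# Hypothetical illustration: pretend the maximum stable overclock for one
# CPU model is normally distributed around 900 MHz with a 40 MHz standard
# deviation. The numbers are made up; only the principle matters.
random.seed(1)
population = [random.gauss(900, 40) for _ in range(100_000)]

mean = sum(population) / len(population)
print(f"population mean: {mean:.0f} MHz")

# A reviewer draws exactly one unit. Do that a few times and watch the
# "review result" bounce around the true average:
for _ in range(5):
    sample = random.choice(population)
    print(f"one review sample: {sample:.0f} MHz")
```

Run it a few times with different seeds and the single-sample results scatter across a wide range, which is exactly why one review unit can’t tell you what the whole production run will do.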
“They Should All Be The Same; It’s Only Fair”
This is often an underlying doctrine behind the buying decisions and later complaints of people. I get many letters from people saying, “Well, I bought _____ because somebody else got _____ out of it.” Later on, I see, “Well, they got it, why can’t I?”
Don’t ever expect statistical variation to be fair. Especially don’t expect it to be fair when you look at the top end of a statistical gathering like our database, and presume, like Garrison Keillor, that all your components will be above average. 🙂
What statistical variation means is that most will fall around a norm, with some below that norm, and some above it. Just because one or some or even most reach a certain level does not mean every one will reach that level, or, more importantly for you, that yours will.
Unless you set your sights very low, you are betting on statistical odds, and the odds exist and work whether you want them to or not.
Just How Bad Is Bad?
I don’t care what it is, in any mass-production product, you’re going to have lemons. But how many lemons do you have to have before the product goes sour?
Let’s take the KT7 as an example. Yesterday, I spent a lot of time reading through newsgroup posts about the KT7. There’s no doubt many of them work fine without massive intervention on the part of the user. There’s no doubt many of them work fine after massive intervention on the part of the user.
There’s just as little doubt that more than a couple isolated ones don’t work fine no matter what the user does. At what point does that become a product you’d rather not touch? Is 5% too high a figure? 10%?
And what about massive intervention? I read a post from one person who said something to the effect of “after spending eight hours reading up on problems and how to solve them, I had no problems with installation.” Should a motherboard require that much effort? Is it reasonable to expect the average motherboard buyer to do that?
I looked at the newsgroups about the Asus A7V, too. There are problems there as well, but not as severe as those reported for the KT7. More than I’d like to see, but on the whole, fewer types of problems, and generally more manageable.
Of course, I could look at any motherboard or other component and read about some problems some people have with it. The issue becomes: at what point does it stop being a matter of normal “product problems” and start becoming a “problem product”?
The Limits of Review Sites
This is a problem review sites don’t handle well. While review sites may be more or less ethical about the products they do review, what I’m talking about is an issue a reviewing site can’t handle, no matter how ethical and honorable. Not unless they start testing a truckload of components for a month at a time, and nobody can afford that.
The truth is most review sites have one-night stands or at most short flings with this equipment. Yes, they can identify obvious dogs. Yes, they can identify problems that crop up. But they can’t test for every possible combination of hardware and software in the world, and if 5% of the boards are defective, and they got one of the 95% that aren’t, they’ll never know it.
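The arithmetic behind that last point is worth spelling out. If a fraction p of boards are defective and a site tests n independent units, the chance it sees no defect at all is (1 − p)^n. A two-line sketch, using the 5% figure from the text:

```python
# Chance a review site testing n independent boards sees zero defects,
# assuming a 5% defect rate (the figure used in the text above).
p = 0.05

for n in (1, 2, 3, 10):
    miss = (1 - p) ** n
    print(f"{n} sample(s): {miss:.1%} chance of seeing no defect")
```

With one sample, the site misses the problem 95% of the time; even testing ten boards, it would still see nothing wrong roughly six times out of ten.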
You, on the other hand, get married to it, and as in marriage, subtle and even not-so-subtle faults you never noticed, or never got the chance to notice, at first become very evident over time.
Relying On Brand Names?
Most people do this in their lives. They do so because a brand name is supposed to indicate a certain level of consistency you can rely on.
Is that a good approach to take when it comes to computer equipment? No. Everybody comes up with dog products. All of them. Even the places that usually do well, and charge a hefty premium because of that perception, occasionally put out substandard products.
The Ethics of Guinea Pigs
Unless manufacturers suddenly start allowing mass testing by some consumer protection group (don’t hold your breath), there’s only one possible source for mass feedback on a product. You, especially those of you who buy right away.
However, that approach relies upon “early implementers,” “pioneers,” whatever euphemism you’d like. Let’s call these folks what they really are: guinea pigs.
Yes, they are volunteers. Yes, they are willing. But a part of me is troubled by having to depend upon people willing to work without a safety net. We’ve at least tried to come up with estimates on future processors and what they’re likely to do, but they’re educated guesstimates.
More troublesome from a statistical viewpoint is that the first folks out skew the results. Early implementers tend to know more about what they’re doing than the average person, and they seem more likely to have the latest equipment.
So when they overclock a new processor, they tend to do pretty well with it. When everybody else jumps on after they’ve effectively indicated the coast is clear, the average results drop.
We’ve noticed that repeatedly while analyzing the CPU Database. The first numbers usually look very good. Only later do we hear about people not having such an easy time reaching certain levels.
On the other hand, when it comes to something like motherboards, early adopters run into huge problems. If you based a judgment on the initial product, you’d never buy anything. Some products are notorious for not getting it right until about the fourth BIOS revision, and those are the better ones.
The ones that are worse never do get better.
Another problem with mass feedback is lack of consistency. Different people with different equipment and different skills are trying to do the same thing, and it’s not surprising they get different results.
Let’s look at equipment, for instance. When we started the CPU Database, you essentially had three factors: how fast the CPU could run, the degree of cooling used, and how fast PCI devices could be overclocked.
The Coppermines tossed in some additional items. The type of memory you had became a big factor in whether your processor reached the desired speed or not. In some cases, an overclocked AGP speed for a video card was the determining factor. The type of motherboard became a significant factor.
You keep adding possible limiting factors to the formula, and it gets more and more difficult to figure out what is limiting somebody. I remember cases where people could only get to, say, 138 MHz with a system, and the reason it couldn’t go further might have been any of three or four separate items. Without further information and/or testing, you simply would not know which one or ones were the cause.
People are another big factor. If I see a substandard result, is it because the equipment is subpar, or the person running it? At what point and on what grounds do you toss results out in any analytical interpretation because you think the tester is an idiot?
At the other extreme, if you have a hypercompetent user who knows every trick in the book, and invented a good number of them, are those results representative of what the average person should expect (knowing that people often just look at the top results in forming their expectations)?
How Much Is Too Much?
I think people like our CPU Database because it’s easy and quick both to enter data and to look at it.
However, for the reasons I’ve pointed out, life has gotten more complicated since we started it. It’s still a good guide, but it could be better.
We could ask for more information, but how many more questions could we ask before you get tired of the whole process and stop filling it out? How many more factors could we go into before you stop reading it?
We get the impression most of you don’t spend a lot of time perusing the numbers and comments we currently have, and wonder if you’d want to look at several times more data if it were available.
We believe many of you would much rather read a quick summary of what you can expect rather than gaze at numbers.
We also think you’d like to know more than performance: reliability and problems with particular items.
We’ve been discussing and working on ways to provide you with this information, but this is really rough.
So We Ask You
I’d like you to send me an email answering the following questions so we can get some guidance on what you do now, and what you’d like to see.
1. When you use the CPU Database, how do you use it? Do you spend a lot of time looking at all the numbers and figures and comments, or not?
2. How important do you find the comments in our current CPU Database in making a buying decision?
3. Assuming we came up with bigger, more inclusive databases covering more areas of the computer than just the CPU, what items would you consider important in making buying decisions? What features would you like to see?
4. How much time and/or how many questions would you be willing to answer to feed the database? How much would be too much in your eyes?
5. If you had to choose between a brief synopsis analyzing the results of the database, or just getting all the raw data to interpret for yourself, which would you prefer to look at most of the time?
6. If you had to choose between two products, and one was the fastest available, but was very quirky, and another was 5% slower, but completely reliable, which would you prefer?
7. What would you like to know about the reliability of a product? Are you interested in how difficult it is to install or to get a product working? At what point would you be discouraged from buying a product due to reliability?