Why Such Differences?

That’s sort of how I’ve felt reading the reactions and experiences of others
about reliability. Not just people’s experiences with one or a few machines; I
expected that. What’s more interesting is that system builders, who’ve built
hundreds of systems, enough to get a fairly representative sample, are split on this issue, too.

For every person who said, “Based on my personal experience (with one or five or ten or twenty or
hundreds of systems), this is better,” I’ve got someone else with the same experience level who’ll say the opposite.

I don’t doubt that on the whole, most of you on both sides have been honest and truthful. If you get nothing else
out of this, understand that there are sincere, competent people out there who are getting quite different results than
what you’re getting. Are there not-so-sincere, not-so-competent people out there reporting different results because they aren’t
so sincere or competent? Sure, but it’s not all of them.

The question you should be asking is not, “Who’s the idiot?” but rather “Why?”, and when you ask “why,” realize that there
isn’t going to be a single answer.

It’s safe to say that much of the reason for different beliefs is that you just can’t pick up relatively small statistical
differences from a small sample. Let’s assume, strictly for argument’s sake, that 15% of Via-based boards and 10% of Intel-based
boards are clearly “defective” (by some agreed-upon standard). Nobody can determine that by testing one or even a handful of boards.
You’d have to test hundreds of boards, maybe more like fifteen hundred setups, before you could say with a good degree of statistical certainty that there’s any difference.

When we get to CPUs, the differences are even tighter. Assume, again for argument’s sake, that the “true” failure rates of AMD and Intel CPUs are 2% and 1%. You’d have to
test many thousands of CPUs before you could definitively show a difference.
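As a rough sanity check on those figures, a standard two-proportion sample-size calculation gives numbers in the same ballpark. The confidence and power settings below (a two-sided test at 95% confidence with 80% power) are my assumptions for illustration, not anything stated above:

```python
import math
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per group needed to detect a difference
    between two defect rates with a two-proportion z-test."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for 95% confidence
    z_b = NormalDist().inv_cdf(power)           # critical value for 80% power
    p_bar = (p1 + p2) / 2
    numerator = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

# 15% vs. 10% "defective" boards: roughly 700 boards per brand,
# i.e. close to fifteen hundred setups in total.
print(n_per_group(0.15, 0.10))

# 2% vs. 1% CPU failure rates: thousands of CPUs per brand.
print(n_per_group(0.02, 0.01))
```

The smaller the gap between the two rates, the faster the required sample size blows up, which is why the CPU comparison demands so many more units than the motherboard one.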

(In this particular case, we might realistically be more interested in the “false” failure rate. If 15% of AMD chips get chopped up due to improper heatsink placement, and 5% of Intel chips get chopped up, that’s probably a far more important statistic in real-life than the “true” failure rate.)

In any of these cases, can we say that there is an overall reliability difference? Of course we can. However, is that difference really meaningful to a single person buying a single CPU or motherboard? That’s far more arguable, at least under these kinds of circumstances.

For the moment, let’s leave aside how we define “failure.” Assume for argument’s sake that we have such a definition. How do we measure it?

For practical purposes, most individual buyers assume close to 100% reliability unless given direct evidence to the contrary.

Where might they get a notion of this? They can look at reviews, and they can look at user comments. Both approaches have big problems.

Reviews of a single product will not and cannot pick up a fractional defect rate. If 90% of Product X units are good and 10% are bad, then depending on random chance, a reviewer looking solely at what he can test “scientifically” will probably conclude it’s fine, but might conclude it’s bad.

Either answer is “hands-on” and as scientifically controlled as you like, and both answers are dead wrong as far as any general applicability goes.
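To put rough numbers on that, here is the chance that a reviewer testing k independent units spots at least one bad one, using the illustrative 10% defect rate above:

```python
# Probability a reviewer sees at least one defective unit out of k tested,
# given a 10% per-unit defect rate (the illustrative figure above).
def p_catch(defect_rate: float, k: int) -> float:
    return 1 - (1 - defect_rate) ** k

for k in (1, 3, 5, 10):
    print(f"{k} unit(s): {p_catch(0.10, k):.0%} chance of catching it")
```

Even a reviewer testing ten units would miss the problem about a third of the time; testing one unit, nine times out of ten.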

Even in those (relatively few) instances where every single example of a product has some defect, a review cannot possibly find it if the reviewer doesn’t test for it, and most problems that emerge from user experience don’t end up being tested in reviews.

OK, let’s look at option two: user comments. Users will test all sorts of situations under all sorts of conditions. Because thousands of people can simply do much, much more than one or a handful of people, they are much, much better at identifying legitimate issues than reviewers.

They also come up with a ton of false positives. A very sizable percentage of the problems reported are caused by the user not knowing what he or she is doing. Someone reading those comments may or may not be able to pick this up. The odds of picking this up increase if the particular user can be questioned, but even then, you can’t be 100% sure.
This person may well have a legitimate problem. The person might not be telling the whole truth, or (more likely) may be leaving out something relevant (which might be way off the beaten path) that questioners don’t ask about. Good troubleshooting and lots of spare parts can identify most problems, but not all.

In any case, user comments introduce a lot of static, which can sometimes overwhelm the message.

Even presuming you can eliminate most of the static, what you get from user comments is a skewed population. People with problems are more likely to say something than people without them. While you can make some rough, crude adjustments and judgments to compensate for that, there’s no denying that whatever results or conclusions you draw are impressions based on anecdotal data.

While it’s false to say information derived that way cannot have value, I’d be the first to state that it’s far from ideal. However, if you can’t achieve ideal, do you just throw up your hands?

I’m not even going to talk about defining “failure” and its potential causes right now. I’ve thought through some possibilities based on the input I’ve gotten, and some of the potential answers are frightening.

Time to Ponder

People are getting burned buying systems. People want things that work almost all the time, and too many are not getting that now.

The current system works very badly. You may not think my trying to analyze user comments is a ton better, but anyone rational would agree we need something better.

Compared to other products, too high a percentage of computer products don’t work right. We need to take a closer look at why. At the very least, we need to make a greater effort to separate the problems that can be identified and dealt with from those that come down to sheer random chance, or that can’t be addressed.

I’ll talk about this next week, so by all means send me your thoughts, but this is going to be a bear to do properly.

I still think what I initially said is pretty valid based on certain angles and approaches for the risk-averse population it was meant for. However, the degree to which it, or for that matter any other assessment, is realistically meaningful to individuals generally needs much more study.

The whole intention is to get more reliable systems. No one should disagree with that.

Email Ed
