SETI And Overclocking

This article originally consisted of an email I sent to SETI. I’ve removed it
since the relevant portions of it were cited by Mr. Korpela, and any problems or questions are certainly resolved
by his email.

I would like to thank SETI for responding quickly and honestly to this. Two thumbs up.

(Italics in the email are from my original email)

From: “Eric J. Korpela”

To: “Edward Stroligo”
Sent: Thursday, July 12, 2001 8:24 PM
Subject: Re: Overclocking and SETI

This would seem to indicate that you would rather not have results produced
by overclocked machines.

Is this an accurate statement? If that is not what you meant, then what did
you mean?

I would say it’s not an accurate statement. I would put it this way: it’s
not that most overclocked machines produce bogus results. That said, many
of our bogus results do come from overclocked machines. Many come from machines
that aren’t overclocked, but have dust or pet hair problems. (We know this
because we have been informing some people who have problem machines, and
have gotten some feedback as to the cause of the problems.)

If bad results from overclocked machines are a significant problem, have you
been able to quantify this problem to any extent? Is this a general problem
with overclocked machines, or perhaps just a problem with a subset of
overclocked machines that have been overclocked too much?

Bogus results aren’t really a problem. My most current estimate is that
<0.3% of our results come from brain-damaged machines. Given our current
level of redundant processing, this isn’t a problem. On average, less than
one in 7 million results (after redundancy checking) would be damaged.
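
(Ed.note: To give a feel for why redundancy shrinks the problem so much, here is a rough sketch in Python. It assumes each copy of a workunit goes bad independently with probability 0.3%, and it simply asks how often every copy would be bad at once. The independence assumption, the function name, and the copy counts are mine, not SETI's, and this is not how the 1-in-7-million figure was derived.)

```python
# Illustrative model only: each copy of a workunit is assumed to be bad
# independently with probability p. This is NOT SETI@home's actual method.

def chance_all_copies_bad(p: float, copies: int) -> float:
    """Probability that every one of `copies` independent results is bad."""
    return p ** copies

p_bad = 0.003  # upper bound quoted in the email (<0.3%)

for copies in (1, 2, 3):
    prob = chance_all_copies_bad(p_bad, copies)
    print(f"{copies} copies: about 1 in {1 / prob:,.0f}")
```

Under this simple model, two copies both being bad happens about once in 111,000 workunits and three copies about once in 37 million, so even a little redundancy pushes the odds down by orders of magnitude.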

If you are responsible about overclocking, and you verify that your machine
is operating properly by running a test workunit, like the Team Lambchop
test workunit, then there is definitely not a problem. Your group could do
something similar (if you wanted) by picking any random workunit to verify
against.

(Ed.note: Go here for more on this)
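
(Ed.note: The idea of verifying against a known workunit boils down to comparing your machine's numbers with those from a known-good run. The Python sketch below shows only that comparison step. The function name, the tolerance, and the sample values are hypothetical and made up for illustration; they are not taken from the SETI@home client or the Team Lambchop test.)

```python
import math

def results_match(reference, candidate, rel_tol=1e-6):
    """True if both lists have the same length and every pair agrees within rel_tol."""
    if len(reference) != len(candidate):
        return False
    return all(
        math.isclose(r, c, rel_tol=rel_tol)
        for r, c in zip(reference, candidate)
    )

# Made-up numbers standing in for values from a known-good run
# and from the machine under test.
reference_run = [1.2345678, 22.000001, 0.0031415]
my_run = [1.2345678, 22.000001, 0.0031415]

if results_match(reference_run, my_run):
    print("match")
else:
    print("MISMATCH: check cooling and clock speed")
```

A relative tolerance is used rather than exact equality because different CPUs and compilers can legitimately differ in the last few digits; how tight the tolerance should be is a judgment call.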

There is a problem if you rely on “not crashing” as being an indicator of
reliable operation. The FPU tends to be quite a bit more speed and heat
sensitive than the integer portions of a chip. FPU errors won’t crash your
machine, but they will screw up SETI@home processing. While this isn’t a
big problem for us, it can be a big problem for you if you are relying on
your FPU for other things, like computations for finance.
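
(Ed.note: Since "not crashing" proves little, one cheap sanity check is to run the same floating-point workload several times and see whether the answers agree bit for bit. The Python sketch below does that. The workload and the function names are my own invention, and a pass is only weak evidence: a marginal FPU may fail only when hot, or may give the same wrong answer every time, which is why comparing against a known-good result is the stronger test.)

```python
import math
import struct

def fp_workload(n: int = 200_000) -> float:
    """Deterministic floating-point workload: a running sum of sin * sqrt terms."""
    total = 0.0
    for i in range(1, n + 1):
        total += math.sin(i * 0.001) * math.sqrt(i)
    return total

def bits(x: float) -> bytes:
    """Raw IEEE 754 bytes of x, so the comparison is exact rather than approximate."""
    return struct.pack("<d", x)

def fp_is_consistent(runs: int = 5, n: int = 200_000) -> bool:
    """Repeat the workload and report whether all runs agree bit for bit."""
    first = bits(fp_workload(n))
    return all(bits(fp_workload(n)) == first for _ in range(runs - 1))

if fp_is_consistent():
    print("consistent")
else:
    print("INCONSISTENT: possible FPU errors")
```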

If bad results from overclocked machines are not a significant problem, then
why was this mentioned in the newsletter?

The mention of overclocking had escaped my notice. David has removed it.
It was necessary to have some explanation of the “bad squares” in the click
plots, though. Sorry that it got blamed on overclockers in general.

Has any consideration been given to some sort of testing module or other
form of feedback that would tell users if they were consistently producing
faulty data?

We haven’t thought too much about formal testing. There’s an unofficial
test, as I mentioned, at Team Lambchop. . . . (Ed.note: This is described in the link above).

We’ve thought about giving feedback, and we have contacted some people
who send damaged results at high rates. Lower level problems (like a single
bad signal per result or people who run many machines, but only have one with
problems) are harder to diagnose in real time and haven’t been enough
of a problem to worry us.

Eric


