Is This Normal?
I've noticed something strange lately. Sometimes when I restart my machine, or turn SETI off and then back on again, the WU I am working on starts over. Sometimes it even starts on a different WU than the one it was working on before the reboot.
Anyway, I am sometimes losing several hours of work and having to crunch it again.
Is anyone else getting this type of error? What could it be caused by?...Windows?...Spy?....Driver?...Hardware?
Running
Win98 SE
Driver 1.6.3.3
Spy 3.0.7
My system is listed below...if you need any other details let me know
Mictlan
08-09-01, 09:58 AM
Sometimes, when the temp.sah file gets corrupted, SETI will start all over again. Or, if you are using a RAMDisk, you will lose your data if you don't tell the program to save to the HD before you shut down the system.
As for the reboots, it might be that your PSU is taking a beating. What size is your PSU? And what else are you running off it? SETI uses a lot of CPU, and the CPU needs juice to do that, so it starts to draw more from the PSU. If you are running a pelt you will be drawing lots of energy from the PSU, and if it is near its limit the PSU will fail, causing a reboot.
Otherwise I'm lost about this one.
[Oc]acaridans
08-09-01, 10:03 AM
I get the same problem, but only on my overclocked box. My other stock boxes have no problems at all. I think what Mictlan says makes sense.
Sorry it took so long to reply...out of town for a few days, but I had a few WU's to upload at least.
I'm not using a RAMdisk or any type of tweak programs. I have a 250w PSU, but I'm running a BX board with a PPGA to FCPGA socket adapter. No pelts, no watercooling. I have 4 extra 60mm fans + a 40mm running off the PSU; other than that, a single hard drive, cdrom, cdrw, zip, and floppy are all that is powered. Think I need a 300w PSU? I don't think that is the problem.
My chip is wired also...VSS to VID3. This was done recently, so it's possible that I corrupted the temp file when testing the tolerances of the chip.
toomnymods
08-14-01, 11:02 AM
Don't feel too bad, I've lost at least 50 WU's that way. Wish they'd redo the program to make it save when you close it out... That's when mine seems to erase.. :confused: So I try not to turn off my computers for reboots (adding hardware or software) till the WU is totally done, then I close out SETI..
Hope this helps..
Crunch on!!!
Morpheus
08-14-01, 03:15 PM
deez... was reading through this thread and wanted to make sure that you are REALLY losing that unit...
I just shut down & restarted Spy/Driver a half dozen times with no problem... shut down the computer & restarted that a couple times too... ??? No problems on my end... must be something there... check Spy... "setup" tab, "client" button... select, "ignore presence of state file on startup" if it wasn't before...
Of course, Spy will take up to a minute to "report" progress with the client... so if you reboot & look at the Spy it will say 0%... or "idle" until the first save... I know this is a "duh" type of thing, but I am trying to cover all the bases...
On rare occasions I have seen Driver start another WU after a reboot... but I have also witnessed it start a "new" WU at 40% or something... in other words, when it cycled back to a previously unfinished WU, it did not reset, but rather started again from where it had left off...
Maybe a full re-install & config is in order... let me know if anything above helped... and I will continue to test and think on the problem too... might even shoot an email over to Mike Ober & see what he says...
Crunch On!!!
Morpheus...I am positive that I have lost partial WU's. I have even checked name of the WU in order to verify that it was crunching the same one again. As you mentioned I have also seen it cycle back to an unfinished WU several times.
It is definitely possible that my errors were because I was long overdue for a format/reinstall. I just put in a new HD today and am in the process of reinstalling everything right now.
Should have SETI back up in an hour or less...will crunch all weekend then post my results
Hopefully this will fix any probs
toomny...that's what I was doing for a while, but I always seemed to need to reboot right in the middle
Morpheus
08-17-01, 09:38 AM
email to Mike Ober sent...
Let's see what "da man" has to say.. :)
Crunch on!!!
Morpheus
08-17-01, 10:26 AM
OK.. here's Mike Ober's thought on the subject...
How far overclocked are the machines that are experiencing this problem? As you are probably aware, the first thing to go in an overclocked processor is the FPU, on which SETI relies heavily.
The other thing to watch for is changes in the last modified date of the work_unit.sah files. I have seen Windows update the date/time stamp on this file for no apparent reason. An easy way to see if this is the reason the WU progress resets is to clear the Hide Processing checkbox and watch the task name on the task bar. SETI Driver uses tasks named "SETI nn" where nn is the WU folder number. When a client process terminates, SETI Driver scans its cache for the oldest work_unit.sah file in the cache and starts that WU.
If the problem is that Windows has updated the last modified time of work_unit.sah, SETI Driver will start a different WU and eventually get back to the one that was processing. If the problem is a FPU failure, the client might be generating garbage and filling up outfile.sah - the client will terminate processing and prepare results.sah when outfile.sah reaches 32Kb in size.
SETI Driver itself doesn't touch the WUs except to open and read the contents of the state.sah file and/or copy several of the .sah files to its root folder for SETISpy. Any SETI monitoring program must also open the .SAH files for reading, which doesn't cause any problems for the SETI Client.
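For anyone who wants to check the timestamp theory on their own cache, here is a rough Python sketch of the "oldest work_unit.sah wins" rule described above. It is not SETI Driver's actual code. The assumed layout (one subfolder per WU under a cache folder, each holding a work_unit.sah) and the function names oldest_work_unit, snapshot_mtimes and changed_folders are made up for illustration. It only reads timestamps, so it should be harmless to run next to the client.

```python
import os
from pathlib import Path


def snapshot_mtimes(cache_dir):
    """Map each WU folder name to the last-modified time of its work_unit.sah.

    Assumption: every WU lives in its own subfolder of cache_dir.
    """
    mtimes = {}
    for folder in Path(cache_dir).iterdir():
        wu_file = folder / "work_unit.sah"
        if folder.is_dir() and wu_file.is_file():
            mtimes[folder.name] = wu_file.stat().st_mtime
    return mtimes


def oldest_work_unit(cache_dir):
    """Return the folder name whose work_unit.sah is oldest, or None if empty."""
    mtimes = snapshot_mtimes(cache_dir)
    if not mtimes:
        return None
    # Oldest timestamp first; ties are broken by folder name.
    return min(mtimes, key=lambda name: (mtimes[name], name))


def changed_folders(before, after):
    """List folders whose work_unit.sah timestamp differs between two snapshots."""
    return sorted(name for name in after
                  if name in before and before[name] != after[name])
```

Take a snapshot before a reboot and another one after. If changed_folders reports a folder you never touched, and oldest_work_unit now names a different folder, that would fit the "Windows updated the time stamp" explanation rather than an FPU problem.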
Crunch On!!!
--------------------------------------------------------------------------------
If the problem is that Windows has updated the last modified time of work_unit.sah, SETI Driver will start a different WU and eventually get back to the one that was processing. If the problem is a FPU failure, the client might be generating garbage and filling up outfile.sah - the client will terminate processing and prepare results.sah when outfile.sah reaches 32Kb in size.
--------------------------------------------------------------------------------
After running all weekend with new HD and a fresh install of everything I have not been able to duplicate my problem so far. It is definitely possible that I have errors due to FPU failure b/c sometimes I would get a WU that would stop around 45% or so and it would indicate that in the log.
I still have one question though......What would cause the WU to start processing from the beginning again??? And yes I have seen it process the same WU with no evidence of processing in the log....even seen it reprocess a WU after 80%+ completion......something corrupt in windows is my best guess
Will keep crunching and update with any news over the next few days/weeks
Morpheus
08-21-01, 12:28 AM
Have you tried and passed the Prime95 torture test? Even the most apparently "stable" overclocks can have problems...
Prime95 will really help you get to a point where you are both OCed to the max & doing stable calcs with your FPU...
Good luck...
Have not run Prime95 in a while.....completely slipped my mind....downloading now
thanks
Well, today was probably one of my best SETI days ever. After a WU with an AR of .048, I got 3 WU's in a row with AR's of 2.0, 9.3 and 7.1.........was able to finish each WU in just over 6 hours, which is screaming for my celeron
No errors yet since reinstall...tried Prime95 but it was acting up so never got it running
crunched a few more WU's the past few days with no errors or problems....I must have needed a format badly
have made no other changes....my times are even looking better now