Back-to-back EUE's


ihrsetrdr
01-06-06, 01:40 PM
My Abit NF7 / 2600 XP-M had been folding along just fine until about 36 hours ago, when all of a sudden it started trashing every WU it got, back to back. I noticed the evidence in my EOC stats yesterday (11 pts for like 5 or 6 WUs), and the log text confirmed the problem. I shut down the client and ran the Prime95 benchmark, but it wouldn't complete the "torture test": screenshot (http://www.geocities.com/hrsetrdr/OC_forums/Prime_NF7.JPG). When I attempted to run 3DMark 2001 there was a message indicating "hardware failure" because a mouse wasn't connected to the machine. Well, this rig is running headless, so they were quite right about that. But I began to suspect that perhaps the hardware failure reported by Prime95's torture test was also due to not having input devices hooked up to the machine. I'm not seeing any real signs of hardware failure here, so I just dumped the whole fAh folder, created a new one and downloaded a fresh console executable. It's running project 2103 right now, so we'll see how it goes.

TollhouseFrank
01-06-06, 02:08 PM
it just suddenly started wonkin' out on ya? Hmm.... ya haven't had any power surges or brownouts lately, have ya? Believe it or not, most of my EUE's come around those 2 events. Makes the PSU "just" unstable enough to put out a freaky dip/spike and blam... EUE

ihrsetrdr
01-06-06, 02:43 PM
Yes! Just had a power dip a couple days ago. All the other machines are fine; all the equipment is on UPS gear. I had to reset my dsl router and cable modem right after the power dip. Ya hit the nail right on the head, I do believe!
Thanks Frank...mystrey solved. :D


BTW, are there 2 ways to spell mystery? The message spell checker was content with mystrey.
:shrug:

ChasR
01-06-06, 02:52 PM
I doubt the message from Prime95 has anything to do with not having a keyboard or mouse hooked up. There just isn't any way for the lack of them to cause a rounding error; that points to system instability. If it failed on the blend test, you should run the small FFT test. If it fails that, you've eliminated RAM as the prime suspect. If it passes that, it's probably a bad stick of RAM. Run the blend test with one stick and then the other to find the bad one.

Joe Camel
01-06-06, 04:23 PM
ya, ive got a "generic store bought" PC @ work that only has a power and Network cord attached and its been FOLDing (QMDs) 24/7 for over a month now no problems...(i keep track with EMIII)


i hate to admit this but this new rig im setting up has EUEed about 10x QMDs while i was @ work today :cry: and it was 12 hour P95 (blend) stable.
(just so you dont feel too bad about the WU's)


GOOD LUCK!! (to both of us)

Joe Camel
01-06-06, 04:29 PM
I doubt the message from Prime95 has anything to do with not having a keyboard or mouse hooked up. There just isn't any way for the lack of them to cause a rounding error; that points to system instability. If it failed on the blend test, you should run the small FFT test. If it fails that, you've eliminated RAM as the prime suspect. If it passes that, it's probably a bad stick of RAM. Run the blend test with one stick and then the other to find the bad one.
could you state that again..

blend stresses....
small FFT stresses...

and i know this is NOT a "hard fast rule" just a "this *usually* means that", rule-of-thumb...

thanks :)

ChasR
01-06-06, 05:00 PM
The blend test stresses everything, with lots of RAM use. With the small FFT test everything fits in the L2 cache, so no RAM is tested. If small FFT craps out it's a CPU-related issue. If it passes small FFT, it's most likely a RAM problem.
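
A minimal sketch of that rule of thumb in Python, for anyone who wants it written down. The function name and the wording of the returned strings are made up for illustration, nothing here talks to Prime95 itself, and it is a "this usually means that" guide, not a hard rule.

```python
from typing import Optional


def prime95_suspect(blend_passed: bool,
                    small_fft_passed: Optional[bool] = None) -> str:
    """Map torture-test outcomes to the most likely suspect.

    Hypothetical helper: it only encodes the rule of thumb from this
    thread (blend uses lots of RAM, small FFT stays inside the L2 cache).
    """
    if blend_passed:
        # Blend found nothing; that still doesn't prove folding stability.
        return "no fault found by the blend test"
    if small_fft_passed is None:
        # Blend failed but small FFT hasn't been run yet.
        return "run the small FFT test next"
    if small_fft_passed:
        # CPU and cache held up, so the failure most likely needs RAM to show.
        return "most likely a RAM problem"
    # Fails even when everything fits in cache.
    return "CPU-related issue"


print(prime95_suspect(blend_passed=False, small_fft_passed=True))
```

If the result points at RAM, the next step from the thread is to re-run the blend test with one stick at a time.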

ihrsetrdr
01-06-06, 05:29 PM
O.K., I see how that works. So far, the NF7 failed the blend test but is currently at test 4 of the 12K portion of the small FFT test. My guess is that it will pass the entire small FFT test; I'm thinking of pulling RAM sticks and re-running the blend test with each stick by itself. How many more sections of the small FFT test are there?

Joe Camel
01-06-06, 05:59 PM
dead on, same thing with one of my rigs!
(while testing the new rig, i noticed one of my "old" rigs started to EUE; and its a work rig @ stock :bang head)
making room for "ease of use" mem swapping procedure...


all the P95 tests are *ENDLESS* so its a time game...6 hours 12 hours 24 hours are usually the time frames one speaks of when stating one's rig is "P95 stable".

i use S-Pi, P95, sandra, and web surfing (in many combinations of simultaneous testing) as my...is this rig stable enough to FOLD on test.

(ive had a rig be Pi stable but not P95 @ one setting but then P95 stable and NOT Pi...hence i use both + )
even then it might not be FOLDing stable :rolleyes:


EDIT:
yup, 1x bum stick-o-Mushkin Green line :(

now to see if setting O/B vid to 1MB will leave me enough mem for QMD's.
wont be dual channel mem but since this mobo doesn't OC, it'll be a dog anyway...:mad:
(nope, only like 400MB free :()

pscout
01-06-06, 07:15 PM
I recently found a few scattered eue's ... usually easy for me to detect since almost all my rigs produce wu's that are 450 ... so any odd numbers trigger me to look for the source. I backed off the fsb by 5 on one of the northwoods in a sometimes warm room, and it has been clean since.

So far I have not encountered 'hard' bad memory ... only soft in that i was pushing the oc or timings too far.

Temps on my rigs seem to be the trigger for my eue's ... so i get the temps lower or lower the fsb a bit.

During initial oc, sometimes i 'gamble' a bit by shortening the initial test time on p95. Get a qmd going, back it up, and disable the lan so it can't send in an EUE ... if it eue's, lowering the fsb is usually my fix rather than messing with mem timings; then restore the backed-up qmd and restart the fold. So if i get an eue i have to restart the wu from the last good backup point, but if it runs clean, i know i am folding stable, and i get qmd points from the testing period most of the time.

Since a qmd takes a day -6/+20 hrs depending on rig, this is a good test.

Joe Camel
01-06-06, 07:21 PM
NICE!!!

i know this has been stated 1k times B4 but 1x more for me;
you only need to copy the "work" folder and the "queue" file and thats all?

pscout
01-06-06, 07:34 PM
Yes, just the work dir and queue.dat.

I delete the old work folder first so no garbage is left behind when i do the cut and paste. I also delete queue.dat - else u will get a popup asking if you want to replace it anyway so it is just as easy to delete it at the same time i delete the work dir and leaves no room for mistakes.

/edit Actually on backup restores like this i dont do a cut and paste since i want to keep the backup. On sneaker netting i do cut and paste.
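
For anyone who would rather script the copy than do it by hand, here is a rough Python sketch of the same backup and restore. It assumes the client folder holds a `work` directory and a `queue.dat` file as described above, and that the client is stopped while files are copied (my assumption, not something stated in the thread). The function names and paths are made up for illustration.

```python
import shutil
from pathlib import Path


def backup_wu(client_dir: Path, backup_dir: Path) -> None:
    """Copy work/ and queue.dat from the client folder into backup_dir."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    old_work = backup_dir / "work"
    if old_work.exists():
        # Clear the previous backup so no stale files are left behind.
        shutil.rmtree(old_work)
    shutil.copytree(client_dir / "work", old_work)
    shutil.copy2(client_dir / "queue.dat", backup_dir / "queue.dat")


def restore_wu(client_dir: Path, backup_dir: Path) -> None:
    """Replace the client's work/ and queue.dat with the backed-up copies.

    This copies rather than moves, so the backup stays available.
    """
    work = client_dir / "work"
    if work.exists():
        # Delete the old work folder first so no garbage is left behind.
        shutil.rmtree(work)
    queue = client_dir / "queue.dat"
    if queue.exists():
        queue.unlink()
    shutil.copytree(backup_dir / "work", work)
    shutil.copy2(backup_dir / "queue.dat", queue)
```

Usage would look like `backup_wu(Path("C:/FAH"), Path("D:/fah_backup"))` before the test run and `restore_wu(...)` with the same two paths after an EUE; both paths are placeholders.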

ihrsetrdr
01-06-06, 08:29 PM
I ran the Blended test with each memory stick individually, and they both failed.
One stick was Corsair XMS PC3200 and the other was a really generic stick of PC3200. ??? I ran 3DMark 2001 and the machine finished the benchmark with a rather decent score, with all settings at default (not overclocked). Not a bad score for my stock GeForce4 Ti4200: linky (http://www.geocities.com/hrsetrdr/OC_forums/3dMark.JPG).

I re-started the fAh client and will let it fold overnight; I unplugged the network cable so I don't get any more garbage credits.

Joe Camel
01-06-06, 08:46 PM
dont forget to copy the work folder and the queue file B4 you go to bed, so if it fails while you sleep, @ least you dont lose the WU (points)
;)

ihrsetrdr
01-06-06, 11:04 PM
Well, I'm stuck here at work right now until 8am so I'll just have to take it as it comes, as they say.

Sooo, I'm not entirely sure that both of these memory sticks are bad...could it be the memory controller on the motherboard? I guess I'll have to swap out some memory from one of my other machines and do more testing.

Any chance that the copy of Prime95 (modded version) that I downloaded from xtremesystems could be corrupt? :shrug:

Joe Camel
01-06-06, 11:18 PM
wow, P95 is crazy enough "stock", a modded version sounds scary :p

i DL it from HERE (http://www.mersenne.org/freesoft.htm)

PS
dont forget the password is: 9876 so you can set priority etc :)

ihrsetrdr
01-06-06, 11:42 PM
My bad, I downloaded Prime95 from the same link you're showing...it was actually a modded version of Super PI (http://www.xtremesystems.com/pi/) I got from xtremesystems, not Prime95. I was confused !

:o

ChasR
01-07-06, 09:16 AM
You could loosen ram timings. Try each stick at 3-4-4-11.

pscout
01-07-06, 10:51 AM
it would seem unlikely that 2 sticks of ram would both fail together. Unless there were a common cause ... or the mem controller is the real problem.

Are you running high vdimm? Does the mem get hot? my ocz EB mem likes 3+ v so i have to cool it or risk burning my fingers if i touch it (and to keep it stable)!

Swapping in other known good mem could also help isolate the cause assuming of course that the mobo is not damaging your ram.

ihrsetrdr
01-07-06, 01:19 PM
Everything is running stock settings, but I was thinking last night maybe I should bump up the vdimm a little to see if that helps. Right now this machine is about 1/4 of the way through a P2103, and no sign of errors. I won't have time today, but tomorrow I'm going to put each of these memory sticks (one at a time) into another machine and run P95 to see how they perform.

Joe Camel
01-07-06, 01:41 PM
new mem in the "work" rig, and more Vdimm & Vcore to the Farm rig has solved my 2x problems (so far) the 4.0GHz has finished 1x QMD (490 PPD) and the work rig (no OC settings) is about 2/3 done (345 PPD)

thanks for the P95 hints ChasR :)

i absolutely LOVE that test plan Pscout!!

i will be testing more OC on the 1x rig.
now i dont have to worry about killing (EUE) WU's or losing FOLDing time.
it was always a struggle for me to put more time finding max OC or just settle for average OC and get the thing FOLDing...now i can basically do BOTH :attn:

GOOD LUCK ihrsetrdr, sounds like you have things under control (again) :)

pscout
01-07-06, 02:10 PM
I still don't get it tho :-/ ... you have got a 630 in an as8, with good mem @ 4 ghz and still only getting 490 ppd? .. my 530j in as8 at 245 - 3675 is reporting 550 ppd (this was the rig i dropped fsb by 5 due to thermals to eliminate eue's) :-/

ihrsetrdr
01-09-06, 10:51 AM
ye gads! this P2103 is taking forever...started on Jan 06 at 10:41am and EM3 says it's due to finish today at around 1:38pm; about 41 minutes a frame. There's no listing for this WU on the projects page...what the heck is it?
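
As a rough cross-check on an estimate like that, the frame time can be turned into a finish time by hand. The sketch below assumes the WU is split into 100 frames and that the client runs without interruption; both are assumptions, since the thread doesn't say how many frames P2103 has. The function name is made up.

```python
from datetime import datetime, timedelta


def estimated_finish(start: datetime, minutes_per_frame: float,
                     total_frames: int = 100) -> datetime:
    """Finish time for a WU run start to end with no downtime.

    total_frames=100 is an assumption, not a known property of P2103.
    """
    return start + timedelta(minutes=minutes_per_frame * total_frames)


start = datetime(2006, 1, 6, 10, 41)
print(estimated_finish(start, 41))  # 2006-01-09 07:01:00
```

Under those assumptions the run comes to a bit over 68 hours of pure folding; any time the client was shut down for other testing pushes the real finish later than that.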

Joe Camel
01-10-06, 02:39 PM
I still don't get it tho :-/ ... you have got a 630 in an as8, with good mem @ 4 ghz and still only getting 490 ppd? .. my 530j in as8 at 245 - 3675 is reporting 550 ppd (this was the rig i dropped fsb by 5 due to thermals to eliminate eue's) :-/
yup, i now have 2x @ about the same settings getting about the same PPD.

the only rig i have that even comes close to what you and ChasR say (and DO) get PPD/GHz wise is a 630 in an Intel mobo (i915p) that hits 435 PPD @ 3.0GHz (stock) and 3-4-4-8 timings...

i was able to get the AS8-v 630 to break the 500 PPD but
1) that was @ 4.125GHz
2) wasnt stable

:shrug: (must be something IM doing, since ALL *MY* rigs are "poor producers") :shrug:
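
To put the numbers quoted in this thread side by side, here is a quick PPD-per-GHz comparison in Python. The rig labels are just shorthand for the setups mentioned above, and PPD also depends on which WUs a rig happens to get, so treat this as a rough yardstick only.

```python
# (PPD, clock in GHz) as quoted in the thread.
rigs = {
    "630 in AS8 @ 4.0 GHz": (490, 4.0),
    "530J in AS8 @ 3.675 GHz": (550, 3.675),
    "630 in i915P @ 3.0 GHz (stock)": (435, 3.0),
}


def ppd_per_ghz(ppd: float, ghz: float) -> float:
    """Points per day divided by clock speed."""
    return ppd / ghz


for name, (ppd, ghz) in rigs.items():
    print(f"{name}: {ppd_per_ghz(ppd, ghz):.1f} PPD/GHz")
```

This prints 122.5, 149.7 and 145.0 PPD/GHz respectively, so on these figures the 4.0 GHz rig is the one lagging per clock.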

pscout
01-10-06, 09:39 PM
@ Joe ... I dunno :bang head :bang head .. should we resurrect your old 'help an amd guy' thread and go through them top to bottom? ... or start a new one?

I am game, and I am sure others will help ... I am cheap and don't like to leave any production capacity unused. :D

Joe Camel
01-10-06, 09:53 PM
ya, i was thinking of starting a new...one. (have to be tomorrow if i do... too late now...sleepy)

only things i can come up with are:
OS (win 2k)
BIOS (as shipped)
drivers (shipped disk)

night :beer: