P3062: Client-core communications error: ERROR 0x1

ihrsetrdr

Señor Senior Member
Joined
May 17, 2005
Location
High Desert, Calif.
Seven-in-a-row of these P3062's have failed at 90%. :eek:


Here's the (shortened) output of one WU:

Code:
[20:08:26] *------------------------------*
[20:08:26] Folding@Home Gromacs SMP Core
[20:08:26] Version 1.74 (November 27, 2006)
[20:08:26] 
[20:08:26] Preparing to commence simulation
[20:08:26] - Ensuring status. Please wait.
[20:08:43] - Looking at optimizations...
[20:08:43] - Working with standard loops on this execution.
[20:08:43] - Previous termination of core w- Expanded 609871 -> 3263133 (d- Expanded 609871 -> 326313- Starting from initial work pa- Starting from initial work pa- Starting from initial work packet
[20:08:43] 
[20:08:43] Project: 3062 (Run 2, Clone 9, Gen 13)
[20:08:43] 
[20:08:43] Entering M.D.
[20:08:49] Protein: p3062_lambdaProtein: p3062_lambda5_99sb
[20:08:49] Writing local files
[20:08:50] Extra SSE boost OK.
[20:08:50] Writing local files
[20:08:50] Completed 0 out of 5000000 steps  (0 percent)
[20:16:36] Writing local files
[20:16:36] Completed 50000 out of 5000000 steps  (1 percent)
[20:24:36] Writing local files
[20:24:36] Completed 100000 out of 5000000 steps  (2 percent)
[20:32:25] Writing local files
[20:32:25] Completed 150000 out of 5000000 steps  (3 percent)
**edited for space**
[07:53:04] Writing local files
[07:53:04] Completed 4450000 out of 5000000 steps  (89 percent)
[08:00:54] Writing local files
[08:00:54] Completed 4500000 out of 5000000 steps  (90 percent)
[08:07:37] Warning:  long 1-4 interactions
[08:07:42] CoreStatus = 1 (1)
[08:07:42] Client-core communications error: ERROR 0x1
[08:07:42] Deleting current work unit & continuing...
[08:12:03] - Warning: Could not delete all work unit files (8): Core returned invalid code
[08:12:03] Trying to send all finished work units
[08:12:03] + No unsent completed units remaining.
[08:12:03] - Preparing to get new work unit...
[08:12:03] + Attempting to get work packet
[08:12:03] - Will indicate memory of 1004 MB

Did we ever figure out what causes the "ERROR 0x1"? I searched the forums but didn't find anything definitive. Something to do with memory? Voltage? Or something else?

System specs:
GA DS3
Q6600 @ 3.15 GHz (9 x 350)
1024 MB G.Skill DDR2 6400
Linux 2.6.22-14

Here are my dmesg results. Some of the entries, like:
ata3: SATA max UDMA/133 cmd 0xffffc200005e4100 ctl 0x0000000000000000 bmdma 0x0000000000000000
have me concerned; I don't recall seeing anything like that on other machines.

I'm thinking it's a hardware problem...?
 
Hardware related, for sure. I went through a long list of troubleshooting steps: finally pulled it out of the case, returned to stock clocks, swapped the CD-ROM and HDD (including cables), and changed out the heatsink fan. Did a fresh install of Debian (lenny) and set up the FAH client. A p3064 (new project?) is currently running and is at 54% completion. I'll probably bump up the OC after the WU returns.

Even though 11 hours of MemTest revealed no errors, I still can't help but believe there may be some issue with these Corsair modules. I had the originals in a different system last year and had constant problems. I RMA'd the originals, and these are the replacements. I'm guessing that they were not new stock, but merely modules someone else had returned. As memory prices are really low for DDR2 right now, I went ahead and ordered some G.SKILL (2 x 1GB) 240-Pin DDR2 800 for half of what I paid for this product just a few months ago.
 
I believe you're exactly right. Congrats on finding it, not an easy thing to do. :beer:
 
I had the same problem happen. It was two Corsair DDR1066 sticks that would pass every stress test. The machine would sometimes kernel panic when running F@H, or sometimes give an error. I ordered some new RAM while I RMA'd the old sticks, and that fixed it. I ended up with some Crucial DDR800 that overclocks better than the Corsair, too.
 
ihrsetrdr: Seven-in-a-row of these P3062's have failed at 90%.

I'm glad you fixed the problem, but I don't understand why the WU quit at the same point every time (90%) if it wasn't WU-related. Is it possible that the WU becomes unusually demanding at 90%? Enough to cause a memory problem in both your machine and mine?

I had the same problem with a 3062 a few weeks ago. It stopped at exactly 90% five times before I caught it. I turned the computer off, went to bed, and downloaded another WU (not a 3062) in the morning. It's been running perfectly ever since. I don't know whether it has gotten another 3062 since then, so I don't know how it would handle one. The memory on my machine is Crucial DDR2 8500.
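
For anyone who wants to catch a repeat failure like this sooner, here is a rough sketch (not an official F@H tool) that scans client log lines and reports the last completed percentage before each "Client-core communications error". It assumes the log lines look exactly like the ones posted earlier in this thread; the function name `find_failures` and the regular expressions are made up for illustration, and other client versions may format their logs differently.

```python
import re

# Patterns are guesses based only on the log excerpt posted in this thread.
PROJECT_RE = re.compile(
    r"^\[\d\d:\d\d:\d\d\] Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)"
)
STEP_RE = re.compile(
    r"^\[\d\d:\d\d:\d\d\] Completed \d+ out of \d+ steps\s+\((\d+) percent\)"
)
ERROR_RE = re.compile(
    r"^\[(\d\d:\d\d:\d\d)\] Client-core communications error: (ERROR 0x[0-9a-fA-F]+)"
)


def find_failures(lines):
    """Return one dict per communications error found in the log lines.

    Each dict records the project tuple (project, run, clone, gen), the last
    percentage logged before the error, the error text, and its timestamp.
    """
    failures = []
    project = None
    last_percent = None
    for raw in lines:
        line = raw.strip()
        m = PROJECT_RE.match(line)
        if m:
            # A new work unit starts: remember it and reset progress.
            project = tuple(int(g) for g in m.groups())
            last_percent = None
            continue
        m = STEP_RE.match(line)
        if m:
            last_percent = int(m.group(1))
            continue
        m = ERROR_RE.match(line)
        if m:
            failures.append(
                {
                    "project": project,
                    "percent": last_percent,
                    "error": m.group(2),
                    "time": m.group(1),
                }
            )
    return failures


# A few lines copied from the log posted above.
SAMPLE = [
    "[20:08:43] Project: 3062 (Run 2, Clone 9, Gen 13)",
    "[07:53:04] Completed 4450000 out of 5000000 steps  (89 percent)",
    "[08:00:54] Completed 4500000 out of 5000000 steps  (90 percent)",
    "[08:07:37] Warning:  long 1-4 interactions",
    "[08:07:42] Client-core communications error: ERROR 0x1",
]

if __name__ == "__main__":
    for failure in find_failures(SAMPLE):
        print(failure)
```

If several entries in a row show the same project and the same percentage, that at least tells you the failures are repeating at one spot. It won't tell you whether the WU or the hardware is to blame.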
 
Glad u got it fixed ih! :beer: I have to say that I'm a fan of both Crucial and G.Skill DDR2 memory. Both have been very solid for me to date... to heck with Corsair memory! :D
 
I used to be strictly Corsair, but the G.Skill DIMMs have been 'doing it' for my rigs just fine.

This machine finished up the P3064 and is into another one; TPF is just under 10 minutes. I won't bother putting any OC on it until the G.Skill arrives (later today or mañana).
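
As a side note on TPF (time per frame, taking one frame to be one 1% step as in the log): it can be estimated straight from the "Completed ... steps" timestamps. The sketch below is only an illustration under that assumption; `frame_times` is a hypothetical helper, and it relies on the timestamp format shown in the log earlier in this thread.

```python
import re

# Matches progress lines as they appear in the log posted in this thread.
FRAME_RE = re.compile(
    r"^\[(\d\d):(\d\d):(\d\d)\] Completed \d+ out of \d+ steps\s+\((\d+) percent\)"
)


def frame_times(lines):
    """Return a list of (percent, seconds taken to reach it from percent - 1)."""
    times = []
    prev = None  # (seconds since midnight, percent) of the previous progress line
    for raw in lines:
        m = FRAME_RE.match(raw.strip())
        if not m:
            continue
        hours, minutes, seconds, percent = (int(g) for g in m.groups())
        now = hours * 3600 + minutes * 60 + seconds
        if prev is not None and percent == prev[1] + 1:
            # Modulo 86400 handles a frame that runs past midnight,
            # since the log timestamps carry no date.
            times.append((percent, (now - prev[0]) % 86400))
        prev = (now, percent)
    return times


# Progress lines copied from the log posted above.
SAMPLE = [
    "[20:08:50] Completed 0 out of 5000000 steps  (0 percent)",
    "[20:16:36] Completed 50000 out of 5000000 steps  (1 percent)",
    "[20:24:36] Completed 100000 out of 5000000 steps  (2 percent)",
    "[20:32:25] Completed 150000 out of 5000000 steps  (3 percent)",
]

if __name__ == "__main__":
    frames = frame_times(SAMPLE)
    average = sum(sec for _, sec in frames) / len(frames)
    print(f"average TPF: {average / 60:.1f} minutes over {len(frames)} frames")
```

Lines that skip a percentage (such as the gap where the log was edited for space) are ignored, so a trimmed log doesn't inflate the estimate.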
 