
Bad WU?


Zerix01

Member
Joined
Mar 12, 2007
It looks like my client had problems yesterday getting a new WU. It segfaulted three times before it got a WU it liked. At first I thought I was having stability issues, since I had just finished a large OC, but then I noticed that it didn't seem like any WU actually downloaded, or it only got a fraction of the file and the client tried to start anyway. What do you guys think? Was this just a fluke on their end?

BTW I tested my system with Prime blend for a day and a half before starting F@H again.

Code:
-----------------------------------------------------------------------
 Total                                      442153943.949952   100.0
-----------------------------------------------------------------------

               NODE (s)   Real (s)      (%)
       Time: 131684.000 131684.000    100.0
                       1d12h34:44
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:     27.135      3.358      0.656     36.579
[11:13:24] Past main M.D. loop
[11:13:24] Will end MPI now
[11:14:24]
[11:14:24] Finished Work Unit:
[11:14:24] - Reading up to 3718704 from "work/wudata_03.arc": Read 3718704
[11:14:24] - Reading up to 1775116 from "work/wudata_03.xtc": Read 1775116
[11:14:24] goefile size: 0
[11:14:24] logfile size: 16913
[11:14:24] Leaving Run
[11:14:26] - Writing 5515133 bytes of core data to disk...
[11:14:26]   ... Done.
[11:14:27] - Shutting down core
[11:16:27]
[11:16:27] Folding@home Core Shutdown: FINISHED_UNIT
[0]0:Return code = 100
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 0, signaled with Quit
[0]3:Return code = 0, signaled with Quit
[11:16:32] CoreStatus = 64 (100)
[11:16:32] Unit 3 finished with 62 percent of time to deadline remaining.
[11:16:32] Updated performance fraction: 0.553101
[11:16:32] Sending work to server


[11:16:32] + Attempting to send results
[11:16:32] - Reading file work/wuresults_03.dat from core
[11:16:32]   (Read 5515133 bytes from disk)
[11:16:32] Connecting to http://171.64.65.56:8080/
[11:18:05] Posted data.
[11:18:05] Initial: 0000; - Uploaded at ~57 kB/s
[11:18:06] - Averaged speed for that direction ~56 kB/s
[11:18:06] + Results successfully sent
[11:18:06] Thank you for your contribution to Folding@Home.
[11:18:06] + Number of Units Completed: 46

[0]0:Return code = 18
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 0, signaled with Quit
[0]3:Return code = 0, signaled with Quit
[11:22:11] - Warning: Could not delete all work unit files (3): Core returned invalid code
[11:22:11] Trying to send all finished work units
[11:22:11] + No unsent completed units remaining.
[11:22:11] - Preparing to get new work unit...
[11:22:11] + Attempting to get work packet
[11:22:11] - Will indicate memory of 1005 MB
[11:22:11] - Connecting to assignment server
[11:22:11] Connecting to http://assign.stanford.edu:8080/
[11:22:11] Posted data.
[11:22:11] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[11:22:11] + News From Folding@Home: Welcome to Folding@Home
[11:22:11] Loaded queue successfully.
[11:22:11] Connecting to http://171.64.65.56:8080/
[11:22:14] Posted data.
[11:22:14] Initial: 0000; - Receiving payload (expected size: 2436021)
[11:22:35] - Downloaded at ~113 kB/s
[11:22:35] - Averaged speed for that direction ~269 kB/s
[11:22:35] + Received work.
[11:22:35] Trying to send all finished work units
[11:22:35] + No unsent completed units remaining.
[11:22:35] + Closed connections
[11:22:35]
[11:22:35] + Processing work unit
[11:22:35] Core required: FahCore_a1.exe
[11:22:35] Core found.
[11:22:36] Working on Unit 04 [March 25 11:22:36]
[11:22:36] + Working ...
[11:22:36] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 04 -checkpoint 15 -verbose -lifeline 7274 -version 601'

[11:22:36]
[11:22:36] *------------------------------*
[11:22:36] Folding@Home Gromacs SMP Core
[11:22:36] Version 1.74 (November 27, 2006)
[11:22:36]
[11:22:36] Preparing to commence simulation
[11:22:36] - Ensuring status. Please wait- Created dyn
[11:22:36] - Files status OK
[11:22:36] Error: Work unit read from disk is invalid
[11:22:36] Finalizing output
[11:22:36] - Expanded 2435509 -> 12886013 (decompressed 529.0 percent)
[11:22:36] - Starting from initial work packet
[11:22:36]
[11:22:36] Project: 2605 (Run 9, Clone 571, Gen 5)
[11:22:36]
[11:22:36] Assembly optimizations on if available.
[11:22:36] Entering M.D.
[11:22:53] 0 percent)
[11:22:53] - Starting from initial work packet
[11:22:53]
[11:22:53] Project: 2605 (Run 9, Clone 571, Gen 5)
[11:22:53]
[11:22:53] Entering M.D.
NNODES=4, MYRANK=2, HOSTNAME=Deepthought
NNODES=4, MYRANK=1, HOSTNAME=Deepthought
NNODES=4, MYRANK=0, HOSTNAME=Deepthought
NNODES=4, MYRANK=3, HOSTNAME=Deepthought
NODEID=0 argc=15
NODEID=3 argc=15
NODEID=2 argc=15
NODEID=1 argc=15
      Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
             Copyright (c) 2001-2004, The GROMACS development team,
            check out http://www.gromacs.org for more information.

        This inclusion of Gromacs code in the Folding@Home Core is under
        a special license (see http://folding.stanford.edu/gromacs.html)
         specially granted to Stanford by the copyright holders. If you
          are interested in using Gromacs, visit www.gromacs.org where
                you can download a free version of Gromacs under
         the terms of the GNU General Public License (GPL) as published
       by the Free Software Foundation; either version 2 of the License,
                     or (at your option) any later version.

starting mdrun 'Protein in POPC'
500000 steps,   1000.0 ps.

[11:23:01] Protein: Protein in POPC
[11:23:01] Writing local files
[11:23:01] Extra SSE boost OK.
[0]0:Return code = 0, signaled with Segmentation fault
[0]1:Return code = 0, signaled with Segmentation fault
[0]2:Return code = 0, signaled with Segmentation fault
[0]3:Return code = 0, signaled with Segmentation fault
[11:23:06] CoreStatus = 0 (0)
[11:23:06] Client-core communications error: ERROR 0x0
[11:23:06] Deleting current work unit & continuing...
[0]0:Return code = 0, signaled with Quit
[0]1:Return code = 18
[0]2:Return code = 0, signaled with Quit
[0]3:Return code = 0, signaled with Quit
[11:27:28] - Warning: Could not delete all work unit files (4): Core returned invalid code
[11:27:28] Trying to send all finished work units
[11:27:28] + No unsent completed units remaining.
[11:27:28] - Preparing to get new work unit...
[11:27:28] + Attempting to get work packet
[11:27:28] - Will indicate memory of 1005 MB
[11:27:28] - Connecting to assignment server
[11:27:28] Connecting to http://assign.stanford.edu:8080/
[11:27:28] Posted data.
[11:27:28] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[11:27:28] + News From Folding@Home: Welcome to Folding@Home
[11:27:29] Loaded queue successfully.
[11:27:29] Connecting to http://171.64.65.56:8080/
[11:27:31] Posted data.
[11:27:31] Initial: 0000; - Receiving payload (expected size: 2436021)
[11:27:52] - Downloaded at ~113 kB/s
[11:27:52] - Averaged speed for that direction ~238 kB/s
[11:27:52] + Received work.
[11:27:52] + Closed connections
[11:27:57]
[11:27:57] + Processing work unit
[11:27:57] Core required: FahCore_a1.exe
[11:27:57] Core found.
[11:27:57] Working on Unit 05 [March 25 11:27:57]
[11:27:57] + Working ...
[11:27:57] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 05 -checkpoint 15 -verbose -lifeline 7274 -version 601'

[11:27:57]
[11:27:57] *------------------------------*
[11:27:57] Folding@Home Gromacs SMP Core
[11:27:57] Version 1.74 (November 27, 2006)
[11:27:57]
[11:27:57] Preparing to commence simulation
[11:27:57] - Ensuring status. Please wait.
[11:28:14] - Looking at optimizations...
[11:28:14] - Working with standard loops on this execution.
[11:28:14] - Created dyn
[11:28:14] - Files status OK
[11:28:14] Error: Work unit read from disk is invalid
[11:28:14] Finalizing output
[11:28:15] - Expanded 2435509 -> 12886013 (decompressed 529.0 percent)
[11:28:15] - Starting from initial work packet
[11:28:15]
[11:28:15] Project: 2605 (Run 9, Clone 571, Gen 5)
[11:28:15]
[11:28:15] Entering M.D.
NNODES=4, MYRANK=2, HOSTNAME=Deepthought
NNODES=4, MYRANK=1, HOSTNAME=Deepthought
NNODES=4, MYRANK=0, HOSTNAME=Deepthought
NNODES=4, MYRANK=3, HOSTNAME=Deepthought
NODEID=0 argc=15
NODEID=3 argc=15
NODEID=2 argc=15
NODEID=1 argc=15
      Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
             Copyright (c) 2001-2004, The GROMACS development team,
            check out http://www.gromacs.org for more information.

        This inclusion of Gromacs code in the Folding@Home Core is under
        a special license (see http://folding.stanford.edu/gromacs.html)
         specially granted to Stanford by the copyright holders. If you
          are interested in using Gromacs, visit www.gromacs.org where
                you can download a free version of Gromacs under
         the terms of the GNU General Public License (GPL) as published
       by the Free Software Foundation; either version 2 of the License,
                     or (at your option) any later version.

[11:28:22] Rejecting checkpoint
starting mdrun 'Protein in POPC'
500000 steps,   1000.0 ps.

[11:28:23] Protein: Protein in POPCExtra SSE boost OK.
[11:28:23]
[11:28:23] Extra SSE boost OK.
[11:28:23] Writing local files
[11:28:23] Completed 0 out of 500000 steps  (0 percent)
[0]0:Return code = 0, signaled with Segmentation fault
[0]1:Return code = 0, signaled with Segmentation fault
[0]2:Return code = 0, signaled with Segmentation fault
[0]3:Return code = 0, signaled with Segmentation fault
[11:28:28] CoreStatus = 0 (0)
[11:28:28] Client-core communications error: ERROR 0x0
[11:28:28] Deleting current work unit & continuing...
[0]0:Return code = 0, signaled with Quit
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 18
[0]3:Return code = 0, signaled with Quit
[11:32:50] - Warning: Could not delete all work unit files (5): Core returned invalid code
[11:32:50] Trying to send all finished work units
[11:32:50] + No unsent completed units remaining.
[11:32:50] - Preparing to get new work unit...
[11:32:50] + Attempting to get work packet
[11:32:50] - Will indicate memory of 1005 MB
[11:32:50] - Connecting to assignment server
[11:32:50] Connecting to http://assign.stanford.edu:8080/
[11:32:50] Posted data.
[11:32:50] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[11:32:50] + News From Folding@Home: Welcome to Folding@Home
[11:32:50] Loaded queue successfully.
[11:32:50] Connecting to http://171.64.65.56:8080/
[11:32:53] Posted data.
[11:32:53] Initial: 0000; - Receiving payload (expected size: 2436021)
[11:33:15] - Downloaded at ~108 kB/s
[11:33:15] - Averaged speed for that direction ~212 kB/s
[11:33:15] + Received work.
[11:33:15] + Closed connections
[11:33:20]
[11:33:20] + Processing work unit
[11:33:20] Core required: FahCore_a1.exe
[11:33:20] Core found.
[11:33:20] Working on Unit 06 [March 25 11:33:20]
[11:33:20] + Working ...
[11:33:20] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 06 -checkpoint 15 -verbose -lifeline 7274 -version 601'

[11:33:20]
[11:33:20] *------------------------------*
[11:33:20] Folding@Home Gromacs SMP Core
[11:33:20] Version 1.74 (November 27, 2006)
[11:33:20]
[11:33:20] Preparing to commence simulation
[11:33:20] - Ensuring status. Please wait.
[11:33:37] - Looking at optimizations...
[11:33:37] - Working with standard loops on this execution.
[11:33:37] - Created dyn
[11:33:37] - Files status OK
[11:33:37] Error: Work unit read from disk is invalid
[11:33:37] Finalizing output
[11:33:38] - Expanded 2435509 -> 12886013 (decompressed 529.0 percent)
[11:33:38] - Starting from initial work packet
[11:33:38]
[11:33:38] Project: 2605 (Run 9, Clone 571, Gen 5)
[11:33:38]
[11:33:38] Entering M.D.
NNODES=4, MYRANK=0, HOSTNAME=Deepthought
NNODES=4, MYRANK=1, HOSTNAME=Deepthought
NNODES=4, MYRANK=2, HOSTNAME=Deepthought
NNODES=4, MYRANK=3, HOSTNAME=Deepthought
NODEID=1 argc=15
NODEID=0 argc=15
NODEID=3 argc=15
NODEID=2 argc=15
      Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
             Copyright (c) 2001-2004, The GROMACS development team,
            check out http://www.gromacs.org for more information.

        This inclusion of Gromacs code in the Folding@Home Core is under
        a special license (see http://folding.stanford.edu/gromacs.html)
         specially granted to Stanford by the copyright holders. If you
          are interested in using Gromacs, visit www.gromacs.org where
                you can download a free version of Gromacs under
         the terms of the GNU General Public License (GPL) as published
       by the Free Software Foundation; either version 2 of the License,
                     or (at your option) any later version.

[11:33:45] Rejecting checkpoint
[11:33:45] OPC
[11:33:45] Writing local files
starting mdrun 'Protein in POPC'
500000 steps,   1000.0 ps.

[11:33:46] Extra SSE boost OK.
[11:33:46]
[11:33:46] Extra SSE boost OK.
[11:33:46] Writing local files
[11:33:46] Completed 0 out of 500000 steps  (0 percent)
[0]0:Return code = 0, signaled with Segmentation fault
[0]1:Return code = 0, signaled with Segmentation fault
[0]2:Return code = 0, signaled with Segmentation fault
[0]3:Return code = 0, signaled with Segmentation fault
[11:33:51] CoreStatus = 0 (0)
[11:33:51] Client-core communications error: ERROR 0x0
[11:33:51] - Attempting to download new core...
[11:33:51] + Downloading new core: FahCore_a1.exe
[11:33:51] Downloading core (/~pande/Linux/x86//Core_a1.fah from www.stanford.edu)
[11:33:51] Initial: AFDE; + 10240 bytes downloaded
[11:33:52] Initial: B54E; + 20480 bytes downloaded
[11:33:52] Initial: D6C2; + 30720 bytes downloaded
[11:33:52] Initial: 9F08; + 40960 bytes downloaded
[11:33:52] Initial: C6C3; + 51200 bytes downloaded
[11:33:52] Initial: EBA8; + 61440 bytes downloaded
[11:33:52] Initial: 3141; + 71680 bytes downloaded
[11:33:52] Initial: D218; + 81920 bytes downloaded
[11:33:52] Initial: F7AC; + 92160 bytes downloaded
[11:33:52] Initial: 820B; + 102400 bytes downloaded
[11:33:52] Initial: 1B1E; + 112640 bytes downloaded
[11:33:53] Initial: C249; + 122880 bytes downloaded
[11:33:53] Initial: 5EBD; + 133120 bytes downloaded
[11:33:53] Initial: CD6C; + 143360 bytes downloaded
[11:33:53] Initial: 221C; + 153600 bytes downloaded
[11:33:53] Initial: DB18; + 163840 bytes downloaded
[11:33:53] Initial: 237E; + 174080 bytes downloaded
[11:33:53] Initial: AEEC; + 184320 bytes downloaded
[11:33:53] Initial: 4C66; + 194560 bytes downloaded
[11:33:53] Initial: AE1E; + 204800 bytes downloaded
[11:33:53] Initial: A37E; + 215040 bytes downloaded
[11:33:53] Initial: 8193; + 225280 bytes downloaded
[11:33:53] Initial: 9F05; + 235520 bytes downloaded
[11:33:54] Initial: AAA5; + 245760 bytes downloaded
[11:33:54] Initial: 6400; + 256000 bytes downloaded
[11:33:54] Initial: 6E3D; + 266240 bytes downloaded
[11:33:54] Initial: EA6B; + 276480 bytes downloaded
[11:33:54] Initial: 820A; + 286720 bytes downloaded
[11:33:55] Initial: DE6D; + 296960 bytes downloaded
[11:33:55] Initial: B97B; + 307200 bytes downloaded
[11:33:55] Initial: 9D5D; + 317440 bytes downloaded
[11:33:55] Initial: 91D7; + 327680 bytes downloaded
[11:33:55] Initial: BB3B; + 337920 bytes downloaded
[11:33:55] Initial: 611B; + 348160 bytes downloaded
[11:33:55] Initial: B290; + 358400 bytes downloaded
[11:33:55] Initial: B0AA; + 368640 bytes downloaded
[11:33:55] Initial: 6A85; + 378880 bytes downloaded
[11:33:55] Initial: BF10; + 389120 bytes downloaded
[11:33:55] Initial: A818; + 399360 bytes downloaded
[11:33:55] Initial: 90E1; + 409600 bytes downloaded
[11:33:55] Initial: 2869; + 419840 bytes downloaded
[11:33:55] Initial: CAFE; + 430080 bytes downloaded
[11:33:55] Initial: 414B; + 440320 bytes downloaded
[11:33:55] Initial: 9B7A; + 450560 bytes downloaded
[11:33:55] Initial: 33AA; + 460800 bytes downloaded
[11:33:55] Initial: B1D5; + 471040 bytes downloaded
[11:33:56] Initial: 0206; + 481280 bytes downloaded
[11:33:56] Initial: 11F4; + 491520 bytes downloaded
[11:33:56] Initial: 31B5; + 501760 bytes downloaded
[11:33:56] Initial: 46B2; + 512000 bytes downloaded
[11:33:56] Initial: 3113; + 522240 bytes downloaded
[11:33:56] Initial: 525A; + 532480 bytes downloaded
[11:33:56] Initial: 66F9; + 542720 bytes downloaded
[11:33:56] Initial: 9672; + 552960 bytes downloaded
[11:33:56] Initial: 9058; + 563200 bytes downloaded
[11:33:56] Initial: 49ED; + 573440 bytes downloaded
[11:33:56] Initial: 515D; + 583680 bytes downloaded
[11:33:56] Initial: CAC0; + 593920 bytes downloaded
[11:33:57] Initial: 0B15; + 604160 bytes downloaded
[11:33:57] Initial: 5A89; + 614400 bytes downloaded
[11:33:57] Initial: 0F31; + 624640 bytes downloaded
[11:33:58] Initial: 2BC3; + 634880 bytes downloaded
[11:33:58] Initial: 3C06; + 645120 bytes downloaded
[11:33:58] Initial: 89C7; + 655360 bytes downloaded
[11:33:58] Initial: 6C54; + 665600 bytes downloaded
[11:33:58] Initial: 8D4D; + 675840 bytes downloaded
[11:33:58] Initial: EA59; + 686080 bytes downloaded
[11:33:58] Initial: C563; + 696320 bytes downloaded
[11:33:58] Initial: 8D45; + 706560 bytes downloaded
[11:33:58] Initial: 9BD0; + 716800 bytes downloaded
[11:33:58] Initial: 130C; + 727040 bytes downloaded
[11:33:58] Initial: CDA1; + 737280 bytes downloaded
[11:33:58] Initial: 7681; + 747520 bytes downloaded
[11:33:58] Initial: 1110; + 757760 bytes downloaded
[11:33:58] Initial: EE35; + 768000 bytes downloaded
[11:33:58] Initial: E5E1; + 778240 bytes downloaded
[11:33:58] Initial: 4B97; + 788480 bytes downloaded
[11:33:58] Initial: 4D75; + 798720 bytes downloaded
[11:33:58] Initial: E268; + 808960 bytes downloaded
[11:33:58] Initial: FAC6; + 819200 bytes downloaded
[11:33:58] Initial: A625; + 829440 bytes downloaded
[11:33:59] Initial: A12A; + 839680 bytes downloaded
[11:33:59] Initial: 83A3; + 849920 bytes downloaded
[11:33:59] Initial: 3BEA; + 860160 bytes downloaded
[11:33:59] Initial: 5298; + 870400 bytes downloaded
[11:33:59] Initial: 4811; + 880640 bytes downloaded
[11:33:59] Initial: EB07; + 890880 bytes downloaded
[11:33:59] Initial: 83FC; + 901120 bytes downloaded
[11:33:59] Initial: FA4E; + 911360 bytes downloaded
[11:33:59] Initial: 2945; + 921600 bytes downloaded
[11:33:59] Initial: 6BC9; + 931840 bytes downloaded
[11:33:59] Initial: E495; + 942080 bytes downloaded
[11:34:00] Initial: 1050; + 952320 bytes downloaded
[11:34:00] Initial: 2070; + 962560 bytes downloaded
[11:34:00] Initial: 1083; + 972800 bytes downloaded
[11:34:00] Initial: 96E5; + 983040 bytes downloaded
[11:34:00] Initial: 3EEE; + 993280 bytes downloaded
[11:34:00] Initial: 84AC; + 1003520 bytes downloaded
[11:34:00] Initial: 3B6B; + 1013760 bytes downloaded
[11:34:00] Initial: 3030; + 1024000 bytes downloaded
[11:34:00] Initial: 4B95; + 1034240 bytes downloaded
[11:34:00] Initial: D9BC; + 1044480 bytes downloaded
[11:34:00] Initial: C5B8; + 1054720 bytes downloaded
[11:34:00] Initial: A5EF; + 1064960 bytes downloaded
[11:34:01] Initial: 28DC; + 1075200 bytes downloaded
[11:34:01] Initial: 0943; + 1085440 bytes downloaded
[11:34:01] Initial: 338A; + 1095680 bytes downloaded
[11:34:01] Initial: ADFC; + 1105920 bytes downloaded
[11:34:01] Initial: ED39; + 1116160 bytes downloaded
[11:34:01] Initial: D284; + 1126400 bytes downloaded
[11:34:01] Initial: 0057; + 1136640 bytes downloaded
[11:34:01] Initial: 3E65; + 1146880 bytes downloaded
[11:34:01] Initial: FCB5; + 1157120 bytes downloaded
[11:34:01] Initial: A7D8; + 1167360 bytes downloaded
[11:34:01] Initial: A564; + 1177600 bytes downloaded
[11:34:02] Initial: 7654; + 1187840 bytes downloaded
[11:34:02] Initial: 0848; + 1198080 bytes downloaded
[11:34:02] Initial: 471E; + 1208320 bytes downloaded
[11:34:02] Initial: A7F3; + 1218560 bytes downloaded
[11:34:02] Initial: FA59; + 1228800 bytes downloaded
[11:34:02] Initial: FBF2; + 1239040 bytes downloaded
[11:34:02] Initial: F54E; + 1249280 bytes downloaded
[11:34:02] Initial: 3023; + 1259520 bytes downloaded
[11:34:02] Initial: AB37; + 1269760 bytes downloaded
[11:34:02] Initial: 0896; + 1280000 bytes downloaded
[11:34:02] Initial: 756D; + 1290240 bytes downloaded
[11:34:02] Initial: C1E7; + 1300480 bytes downloaded
[11:34:03] Initial: 9AAC; + 1310720 bytes downloaded
[11:34:03] Initial: E5AF; + 1320960 bytes downloaded
[11:34:03] Initial: BBE3; + 1331200 bytes downloaded
[11:34:03] Initial: 3596; + 1341440 bytes downloaded
[11:34:03] Initial: 924C; + 1351680 bytes downloaded
[11:34:03] Initial: 30B7; + 1361920 bytes downloaded
[11:34:03] Initial: AEB7; + 1372160 bytes downloaded
[11:34:03] Initial: 7D25; + 1382400 bytes downloaded
[11:34:03] Initial: 0FEB; + 1392640 bytes downloaded
[11:34:03] Initial: 3131; + 1402880 bytes downloaded
[11:34:03] Initial: 755F; + 1413120 bytes downloaded
[11:34:04] Initial: 4800; + 1423360 bytes downloaded
[11:34:04] Initial: 1282; + 1433600 bytes downloaded
[11:34:04] Initial: B2A3; + 1443840 bytes downloaded
[11:34:04] Initial: 21E9; + 1454080 bytes downloaded
[11:34:04] Initial: 789E; + 1464320 bytes downloaded
[11:34:04] Initial: 8542; + 1474560 bytes downloaded
[11:34:04] Initial: 3A56; + 1484800 bytes downloaded
[11:34:04] Initial: D4FE; + 1490945 bytes downloaded
[11:34:04] Verifying core Core_a1.fah...
[11:34:04] Signature is VALID
[11:34:04]
[11:34:04] Trying to unzip core FahCore_a1.exe
[11:34:05] Decompressed FahCore_a1.exe (3625104 bytes) successfully
[11:34:05] + Core successfully engaged
[11:34:05] Deleting current work unit & continuing...
[0]0:Return code = 0, signaled with Quit
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 0, signaled with Quit
[0]3:Return code = 18
[11:38:27] - Warning: Could not delete all work unit files (6): Core returned invalid code
[11:38:27] Trying to send all finished work units
[11:38:27] + No unsent completed units remaining.
[11:38:27] - Preparing to get new work unit...
[11:38:27] + Attempting to get work packet
[11:38:27] - Will indicate memory of 1005 MB
[11:38:27] - Connecting to assignment server
[11:38:27] Connecting to http://assign.stanford.edu:8080/
[11:38:27] Posted data.
[11:38:27] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[11:38:27] + News From Folding@Home: Welcome to Folding@Home
[11:38:27] Loaded queue successfully.
[11:38:27] Connecting to http://171.64.65.56:8080/
[11:38:27] Posted data.
[11:38:27] Initial: 0000; - Error: Bad packet type from server, expected work assignment
[11:38:27] - Attempt #1  to get work failed, and no other work to do.
             Waiting before retry.
[11:38:40] + Attempting to get work packet
[11:38:40] - Will indicate memory of 1005 MB
[11:38:40] - Connecting to assignment server
[11:38:40] Connecting to http://assign.stanford.edu:8080/
[11:38:40] Posted data.
[11:38:40] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[11:38:40] + News From Folding@Home: Welcome to Folding@Home
[11:38:40] Loaded queue successfully.
[11:38:40] Connecting to http://171.64.65.56:8080/
[11:38:43] Posted data.
[11:38:43] Initial: 0000; - Receiving payload (expected size: 2434520)
[11:39:04] - Downloaded at ~113 kB/s
[11:39:04] - Averaged speed for that direction ~192 kB/s
[11:39:04] + Received work.
[11:39:04] + Closed connections
[11:39:09]
[11:39:09] + Processing work unit
[11:39:09] Core required: FahCore_a1.exe
[11:39:09] Core found.
[11:39:09] Working on Unit 07 [March 25 11:39:09]
[11:39:09] + Working ...
[11:39:09] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 07 -checkpoint 15 -verbose -lifeline 7274 -version 601'

[11:39:09]
[11:39:09] *------------------------------*
[11:39:09] Folding@Home Gromacs SMP Core
[11:39:09] Version 1.74 (November 27, 2006)
[11:39:09]
[11:39:09] Preparing to commence simulation
[11:39:09] - Ensuring status. Please wait.
[11:39:26] - Looking at optimizations...
[11:39:26] - Working with standard loops on this execution.
[11:39:26] - Created dyn
[11:39:26] - Files status OK
[11:39:26] Error: Work unit read from disk is invalid
[11:39:26] Finalizing output
[11:39:27] - Expanded 2434008 -> 12897049 (decompressed 529.8 percent)
[11:39:27] - Starting from initial work packet
[11:39:27]
[11:39:27] Project: 2605 (Run 7, Clone 492, Gen 36)
[11:39:27]
[11:39:27] Entering M.D.
NNODES=4, MYRANK=0, HOSTNAME=Deepthought
NNODES=4, MYRANK=2, HOSTNAME=Deepthought
NNODES=4, MYRANK=1, HOSTNAME=Deepthought
NNODES=4, MYRANK=3, HOSTNAME=Deepthought
NODEID=3 argc=15
NODEID=2 argc=15
NODEID=0 argc=15
NODEID=1 argc=15
      Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
             Copyright (c) 2001-2004, The GROMACS development team,
            check out http://www.gromacs.org for more information.

        This inclusion of Gromacs code in the Folding@Home Core is under
        a special license (see http://folding.stanford.edu/gromacs.html)
         specially granted to Stanford by the copyright holders. If you
          are interested in using Gromacs, visit www.gromacs.org where
                you can download a free version of Gromacs under
         the terms of the GNU General Public License (GPL) as published
       by the Free Software Foundation; either version 2 of the License,
                     or (at your option) any later version.

[11:39:34] Rejecting checkpoint
[11:39:35] Protein: Protein in POPC
starting mdrun 'Protein in POPC'
500000 steps,   1000.0 ps.

[11:39:35] xtra SSE boost OK.
[11:39:35]
[11:39:35] Extra SSE boost OK.
[11:39:36] Writing local files
[11:39:36] Completed 0 out of 500000 steps  (0 percent)
[11:54:35] Timered checkpoint triggered.
[11:58:38] - Autosending finished units...
[11:58:38] Trying to send all finished work units
[11:58:38] + No unsent completed units remaining.
[11:58:38] - Autosend completed
[12:01:49] Writing local files
[12:01:49] Completed 5000 out of 500000 steps  (1 percent)
[12:16:49] Timered checkpoint triggered.
[12:26:12] Writing local files
 
If the vcore on that 3600 is still @ 1.25v with the OC @ 2.7GHz I'd give it a bump up to 1.30v and see if that doesn't cure your problem.

Save a networking fluke, I'd say something isn't quite stable and I bet it's the cpu.
 
Since it failed on the same WU three times in the same place, it is likely to be a WU problem. Had it failed in a different place each time, that would point to a hardware problem.
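If you want to check that against a log rather than eyeball it, here is a rough Python sketch. It assumes the log lines look like the ones posted above ("Working on Unit NN", "Project: ... (Run, Clone, Gen)", "signaled with Segmentation fault"). The function name is made up for illustration and is not part of the F@H client.

```python
import re
from collections import Counter

# Patterns are taken from the log lines posted in this thread.
UNIT_START_RE = re.compile(r"Working on Unit (\d+)")
PROJECT_RE = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")


def segfaults_by_work_unit(log_text):
    """Count crashed attempts per (project, run, clone, gen).

    Each attempt is counted once, even though every MPI rank prints
    its own "Segmentation fault" line.
    """
    crashes = Counter()
    current = None   # work unit announced for the attempt in progress
    counted = True   # nothing to count until an attempt starts
    for line in log_text.splitlines():
        if UNIT_START_RE.search(line):
            # A new attempt begins; forget the previous work unit.
            current, counted = None, False
            continue
        match = PROJECT_RE.search(line)
        if match:
            current = tuple(int(g) for g in match.groups())
        elif ("signaled with Segmentation fault" in line
              and current and not counted):
            crashes[current] += 1
            counted = True
    return crashes
```

To use it, read the client log into a string (for example `open("FAHlog.txt").read()`, where the file name is an assumption about your setup) and pass it in. If every crash lands on one (project, run, clone, gen) tuple, that fits the bad-WU explanation. Crashes spread over different WUs would point more toward hardware.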
 

Well, I do think I have hit the wall with this CPU. If I raise the bus 2MHz more, then the system will lock up during Prime. I did raise the voltage at that point in .25v steps up to 1.3v, but at every voltage step the system would lock up. Also, with the voltage brought up, my CPU temps rocketed to over 60C, which is too high for my liking and possibly why it was locking up. I lowered the bus down to where it is now and kept the vcore at stock, and Prime ran fine. I also completed one whole WU before this issue occurred, and the one I'm on now is at 40% with no issues.

I kind of dismissed it being a stability issue because, if you look at my log, the three times it failed it didn't look like it even downloaded anything. Then, before my current WU started, you can see the file being downloaded. It is just odd that it even tried to start processing the bad WU in the first place.
 
Actually, that long run of download status lines is the core (FahCore_a1.exe), not a WU. The client re-downloaded it in case it had been corrupted.

The client did download the WU three times from the server. On the fourth attempt the server told the client "too many times" and gave it a different one.
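The two kinds of download are easy to mix up in the log, so here is a small Python sketch that separates them. It assumes the line formats seen in the log above: "Receiving payload (expected size: N)" for a WU, and "+ N bytes downloaded" for the core. The function name is hypothetical.

```python
import re

# Work-unit download: one line per WU, with the expected payload size.
PAYLOAD_RE = re.compile(
    r"\[(\d\d:\d\d:\d\d)\] .*Receiving payload \(expected size: (\d+)\)"
)
# Core download: many progress lines carrying a running byte total.
CORE_CHUNK_RE = re.compile(r"\+ (\d+) bytes downloaded")


def summarize_downloads(log_text):
    """Return (payloads, core_bytes).

    payloads   -- list of (timestamp, expected_size), one per WU download
    core_bytes -- largest running total reported by a core download
    """
    payloads = []
    core_bytes = 0
    for line in log_text.splitlines():
        match = PAYLOAD_RE.search(line)
        if match:
            payloads.append((match.group(1), int(match.group(2))))
            continue
        match = CORE_CHUNK_RE.search(line)
        if match:
            core_bytes = max(core_bytes, int(match.group(1)))
    return payloads, core_bytes
```

Run over a log like the one in the first post, the payload list should have one entry per WU fetch, each only a line or two in the log, while the long block of progress lines collapses into the single core byte total.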
 