
Unknown WU


Shelnutt2

OK, long story short: I deleted my work folder while experimenting, and when the client downloaded a new WU I got a weird one. I have no clue what it is, and FahMon doesn't know either. This is the Linux SMP beta client.

Log file (of just this WU)
Code:
--- Opening Log file [April 29 23:24:24] 


# SMP Client ##################################################################
###############################################################################

                       Folding@Home Client Version 5.91beta3

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/shelnutt/folding/SMP
Executable: ./fah5
Arguments: -verbosity 9 

[23:24:24] - Ask before connecting: No
[23:24:24] - User name: Shelnutt2 (Team 32)
[23:24:24] - User ID: 15EB378B33135885
[23:24:24] - Machine ID: 1
[23:24:24] 
[23:24:24] Work directory not found. Creating...
[23:24:25] Loaded queue successfully.
[23:24:25] - Autosending finished units...
[23:24:25] Trying to send all finished work units
[23:24:25] + No unsent completed units remaining.
[23:24:25] - Autosend completed
[23:24:25] 
[23:24:25] + Processing work unit
[23:24:25] Core required: FahCore_a1.exe
[23:24:25] Core found.
[23:24:25] Working on Unit 00 [April 29 23:24:25]
[23:24:25] + Working ...
[23:24:25] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 00 -checkpoint 30 -verbose -lifeline 27823 -version 591'

[23:24:25] 
[23:24:25] *------------------------------*
[23:24:25] Folding@Home Gromacs SMP Core
[23:24:25] Version 1.73 (November 27, 2006)
[23:24:25] 
[23:24:25] Preparing to commence simulation
[23:24:25] - Ensuring status. Please wait.
[23:24:25] put
[23:24:42] - Starting from initial work packet
[23:24:42] 
[23:24:42] Project: 0 (Run 0, Clone 0, Gen 0)
[23:24:42] 
[23:24:42] Error: Could not write local file.  Exiting.
[23:24:42] - Shutting down core
[23:26:29] CoreStatus = 12 (18)
[23:26:29] Client-core communications error: ERROR 0x12
[23:26:29] Deleting current work unit & continuing...
[23:30:50] - Warning: Could not delete all work unit files (0): Core returned invalid code
[23:30:50] Trying to send all finished work units
[23:30:50] + No unsent completed units remaining.
[23:30:50] - Preparing to get new work unit...
[23:30:50] + Attempting to get work packet
[23:30:50] - Will indicate memory of 1004 MB
[23:30:50] - Connecting to assignment server
[23:30:50] Connecting to http://assign.stanford.edu:8080/
[23:30:50] Posted data.
[23:30:50] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[23:30:50] + News From Folding@Home: Welcome to Folding@Home
[23:30:50] Loaded queue successfully.
[23:30:50] Connecting to http://171.64.65.56:8080/
[23:30:55] Posted data.
[23:30:55] Initial: 0000; - Receiving payload (expected size: 4017496)
[23:30:59] - Downloaded at ~980 kB/s
[23:30:59] - Averaged speed for that direction ~864 kB/s
[23:30:59] + Received work.
[23:30:59] + Closed connections
[23:31:04] 
[23:31:04] + Processing work unit
[23:31:04] Core required: FahCore_a1.exe
[23:31:04] Core found.
[23:31:04] Working on Unit 01 [April 29 23:31:04]
[23:31:04] + Working ...
[23:31:04] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 01 -checkpoint 30 -verbose -lifeline 27823 -version 591'

[23:31:04] 
[23:31:04] *------------------------------*
[23:31:04] Folding@Home Gromacs SMP Core
[23:31:04] Version 1.73 (November 27, 2006)
[23:31:04] 
[23:31:04] Preparing to commence simulation
[23:31:04] - Ensuring status. Please wait.
[23:31:21] - Looking at optimizations...
[23:31:21] - Working with standard loops on this execution.
[23:31:21] - Previous termination of core was improper.
[23:31:21] - Going to use standard loops.
[23:31:21] - Files status OK
[23:31:22] (decompressed 538.2 percent)
[23:31:22] 8 (decompressed 538.2 percent)
[23:31:23] - Starting from Entering M.D.
[23:31:23] acket
[23:31:23] 
[23:31:23] Project: 2Entering M.D.
[23:31:23] one 211, Gen 3)
[23:31:23] 
[23:31:23] Entering M.D.
[23:31:29] files
[23:31:29] n: Protein
[23:31:29] Writing local files
[23:31:30] ompleted 0 out of 500000 steps  (0 percent)


unitinfo:
Current Work Unit
-----------------
Name: Protein
Tag: -
Download time: April 29 23:30:59
Due time: May 3 23:30:59
Progress: 0% [__________]

current.xyz:
6111 Protein
 
Did you delete the work dir but not the queue.dat?

I have had the dreaded 0 0/0/0 before, caused by a WU that hung at 100% and by restarting the client before reading this:
http://forum.folding-community.org/viewtopic.php?t=18036&sid=22ef3370feab12fb972c584757a01879

Fortunately I copied the WU before stopping and restarting the client, so I could still run qfix. Qgen is more work, if that's what it takes to recover it.

I'm not sure I would waste much time letting it fold as is. If qfix doesn't recover it, it might be better to move the work dir and queue.dat to another folder to attempt a repair there, while you remove the current work dir and queue.dat and restart the client. The assignment server will probably give you the same WU again anyway, so you can start clean.
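If it helps, the "move it aside, then start clean" step can be scripted. This is only a rough sketch, assuming the client is already stopped and that the client folder holds a `work` directory and a `queue.dat` file as in the log above. The function name `set_aside_work` and the `fah_backup_` folder prefix are made up for illustration and are not part of the client or of qfix.

```python
import shutil
import time
from pathlib import Path


def set_aside_work(client_dir, backup_root=None):
    """Move work/ and queue.dat out of client_dir into a timestamped backup folder.

    Hypothetical helper: stop the client first. Returns the backup folder path.
    """
    client_dir = Path(client_dir)
    # By default, put the backup next to the client folder (an assumption).
    backup_root = Path(backup_root) if backup_root else client_dir.parent
    backup = backup_root / ("fah_backup_" + time.strftime("%Y%m%d_%H%M%S"))
    backup.mkdir(parents=True, exist_ok=False)

    for name in ("work", "queue.dat"):
        src = client_dir / name
        if src.exists():
            # Moving (not copying) leaves the client folder clean for a restart.
            shutil.move(str(src), str(backup / name))
    return backup
```

For example, `set_aside_work("/home/shelnutt/folding/SMP")` would leave the client folder without `work` or `queue.dat`, so a restarted client fetches a fresh assignment, while the moved copies stay available for a repair attempt.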
 
Thanks Pscout! qfix worked its magic! It's a P2609 I've got here.
 
This one is not listed....


[23:18:19] Folding@Home Gromacs Core
[23:18:19] Version 1.90 (March 8, 2006)
[23:18:19]
[23:18:19] Preparing to commence simulation
[23:18:19] - Ensuring status. Please wait.
[23:18:36] - Looking at optimizations...
[23:18:36] - Working with standard loops on this execution.
[23:18:36] - Previous termination of core was improper.
[23:18:36] - Files status OK
[23:18:37] - Expanded 291116 -> 1461493 (decompressed 502.0 percent)
[23:18:37] - Checksums don't match (work/wudata_08.xtc)
[23:18:37] - Starting from initial work packet
[23:18:37]
[23:18:37] Project: 3038 (Run 14, Clone 738, Gen 7)
[23:18:37]
[23:18:37] Entering M.D.
[23:18:50] Protein: p3038_supervillin-03
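For anyone trying to identify a unit from a log like the ones in this thread, the `Project: ... (Run ..., Clone ..., Gen ...)` line carries the numbers to look up. Below is a minimal sketch that pulls them out of log text, assuming the line format stays as shown in these pastes. The function name `find_work_units` is made up for illustration, and garbled lines (such as the interleaved SMP output earlier in the thread) are simply skipped.

```python
import re

# Matches lines like: "Project: 3038 (Run 14, Clone 738, Gen 7)"
PRCG = re.compile(
    r"Project:\s*(\d+)\s*\(Run\s*(\d+),\s*Clone\s*(\d+),\s*Gen\s*(\d+)\)"
)


def find_work_units(log_text):
    """Return a list of dicts, one per well-formed Project line in log_text."""
    results = []
    for match in PRCG.finditer(log_text):
        project, run, clone, gen = (int(g) for g in match.groups())
        results.append(
            {"project": project, "run": run, "clone": clone, "gen": gen}
        )
    return results
```

A result with project 0 and run/clone/gen all 0 corresponds to the "0 0/0/0" case mentioned earlier, where the queue information did not match the unit.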
 