SIGTERM signal (2) from OS & Bad WUs?


linksep
05-29-04, 01:09 AM
[05:05:49] Finished a frame (204)
[05:14:28] Finished a frame (205)
[05:25:03] Finished a frame (206)
[05:35:14] Finished a frame (207)
[05:42:39] Finished a frame (208)
[05:50:31] Finished a frame (209)
[05:54:45] Received unknown signal from OS. Ignoring...Received unknown signal from OS. Ignoring...***** Got a SIGTERM signal (2) from OS
Folding@home Client Shutdown.


--- Opening Log file [May 28 05:56:47]


# Windows Console Edition #####################################################
###############################################################################

Folding@home Client Version 4.00

http://folding.stanford.edu

####################################################

Arguments: -service -forceSSE -verbosity 9

Warning:
By using the -forceSSE flag, you are overriding program
safeguards that monitor the stability of SSE
instructions on your system. If you did not intend
to do this, please restart the program without
-forceSSE. If work units are not completing fully,
then please discontinue use of the flag.

[05:56:47] - Ask before connecting: No
[05:56:47] - User name: Linksep (Team 32)
[05:56:47] - User ID = 5983965722C4211B
[05:56:47] - Machine ID: 1
[05:56:47]
[05:56:48] Loaded queue successfully.
[05:56:48] + Benchmarking ...
[05:56:50] The benchmark result is 7176
[05:56:50]
[05:56:50] - Autosending finished units...
[05:56:50] Trying to send all finished work units
[05:56:50] + No unsent completed units remaining.
[05:56:50] - Autosend completed
[05:56:50] + Processing work unit
[05:56:50] Core required: FahCore_65.exe
[05:56:50] Core found.
[05:56:50] Working on Unit 01 [May 28 05:56:50]
[05:56:50] + Working ...
[05:56:50] - Calling 'FahCore_65.exe -dir work/ -suffix 01 -checkpoint 15 -service -forceSSE -verbose -lifeline 1040 -version 400'

[05:56:51] Folding@home Client Core Version 2.52 (February 10, 2004)
[05:56:51]
[05:56:51] Proj: work/wudata_01
[05:56:51] Done: 23016 -> 143240 (decompressed 622.3 percent)
[05:56:51] nsteps: 5000000 dt: 2.000000 dt_dump: 250.000000 temperature: 444.000000
[05:56:51] xyzfile:
[05:56:51] " 397 p695_L939_WT_Nat_444K
[05:56:51] 1 N 57.382676 16.904458 -72.019975 ..."
[05:56:51] keyfile:
[05:56:51] "parameters ./proj695.prm
[05:56:51] NOVERSION
[05:56:51] ARCHIVE
[05:56:51]
[05:56:51] cutoff 16.0
[05:56:51] taper 12...."
[05:56:51]
[05:56:51] Hashes matched on file work/wudata_01.dyn
[05:56:51] VerifyARCFile: Checksums don't match (frame 1). Failed verification
[05:56:51] Starting from initial work packet
[05:56:51]
[05:56:51] Protein: p695_L939_WT_Nat_444K
[05:56:51] - Run: 17 (Clone 45, Gen 8)
[05:56:51] - Frames Completed: 0, Remaining: 400
[05:56:51] - Dynamic steps required: 5000000
[05:56:51]
[05:56:51] Writing local files:
[05:56:51]
[05:56:51] parameters work/wudata_01.prm
[05:56:51] - Writing "work/wudata_01.key": (overwrite) successful.
[05:56:51] - Writing "work/wudata_01.xyz": (overwrite) successful.
[05:56:51] - Writing "work/wudata_01.prm": (overwrite) successful.
[05:56:52] - Writing "work/wudata_01.key": (append) successful.
[05:56:52]
[05:56:52] PROJECT="work/wudata_01", NSTEPS=5000000, DT=2.0000, DTDUMP=25.000000, TEMP=444.00
[05:56:52] TINKER: Software Tools for Molecular Design
[05:56:52] Version 3.8 October 2000
[05:56:52] Copyright (c) Jay William Ponder 1990-2000
[05:56:52] portions Copyright (c) Michael Shirts 2001
[05:56:52] portions Copyright (c) Vijay S Pande 2001


--- Opening Log file [May 28 06:09:22]


# Windows Console Edition #####################################################
###############################################################################

Folding@home Client Version 4.00

http://folding.stanford.edu

###############################################################################
###############################################################################

Arguments: -service -forceSSE -verbosity 9

Warning:
By using the -forceSSE flag, you are overriding program
safeguards that monitor the stability of SSE
instructions on your system. If you did not intend
to do this, please restart the program without
-forceSSE. If work units are not completing fully,
then please discontinue use of the flag.

[06:09:22] - Ask before connecting: No
[06:09:22] - User name: Linksep (Team 32)
[06:09:22] - User ID = 5983965722C4211B
[06:09:22] - Machine ID: 1
[06:09:22]
[06:09:22] Loaded queue successfully.
[06:09:22] + Benchmarking ...
[06:09:25] The benchmark result is 6240
[06:09:25]
[06:09:25] - Autosending finished units...
[06:09:25] Trying to send all finished work units
[06:09:25] + No unsent completed units remaining.
[06:09:25] - Autosend completed
[06:09:25] + Processing work unit
[06:09:25] Core required: FahCore_65.exe
[06:09:25] Core found.
[06:09:25] Working on Unit 01 [May 28 06:09:25]
[06:09:25] + Working ...
[06:09:25] - Calling 'FahCore_65.exe -dir work/ -suffix 01 -checkpoint 15 -service -forceSSE -verbose -lifeline 1028 -version 400'

[06:09:25] Folding@home Client Core Version 2.52 (February 10, 2004)
[06:09:25]
[06:09:25] Proj: work/wudata_01
[06:09:26] Done: 23016 -> 143240 (decompressed 622.3 percent)
[06:09:26] nsteps: 5000000 dt: 2.000000 dt_dump: 250.000000 temperature: 444.000000
[06:09:26] xyzfile:
[06:09:26] " 397 p695_L939_WT_Nat_444K
[06:09:26] 1 N 57.382676 16.904458 -72.019975 ..."
[06:09:26] keyfile:
[06:09:26] "parameters ./proj695.prm
[06:09:26] NOVERSION
[06:09:26] ARCHIVE
[06:09:26]
[06:09:26] cutoff 16.0
[06:09:26] taper 12...."
[06:09:26]
[06:09:26] - Couldn't get size info for dyn file: work/wudata_01.dyn
[06:09:26] Starting from initial work packet
[06:09:26]
[06:09:26] Protein: p695_L939_WT_Nat_444K
[06:09:26] - Run: 17 (Clone 45, Gen 8)
[06:09:26] - Frames Completed: 0, Remaining: 400
[06:09:26] - Dynamic steps required: 5000000
[06:09:26]
[06:09:26] Writing local files:
[06:09:26]
[06:09:26] parameters work/wudata_01.prm
[06:09:26] - Writing "work/wudata_01.key": (overwrite) successful.
[06:09:26] - Writing "work/wudata_01.xyz": (overwrite) successful.
[06:09:26] - Writing "work/wudata_01.prm": (overwrite) successful.
[06:09:26] - Writing "work/wudata_01.key": (append) successful.
[06:09:26]
[06:09:26] PROJECT="work/wudata_01", NSTEPS=5000000, DT=2.0000, DTDUMP=25.000000, TEMP=444.00
[06:09:27] TINKER: Software Tools for Molecular Design
[06:09:27] Version 3.8 October 2000
[06:09:27] Copyright (c) Jay William Ponder 1990-2000
[06:09:27] portions Copyright (c) Michael Shirts 2001
[06:09:27] portions Copyright (c) Vijay S Pande 2001
[06:16:44] Finished a frame (1)
[06:20:54]
[06:20:54] Received faulty work unit.
[06:21:04] logfile size: 230400
[06:21:04] - Writing 230912 bytes of core data to disk.
[06:21:04] end (WriteWorkResults)
[06:21:04]
[06:21:04] Folding@home Core Shutdown: BAD_WORK_UNIT
[06:21:06] CoreStatus = 72 (114)
[06:21:06] Sending work to server

[06:21:06] + Attempting to send results
[06:21:06] - Reading file work/wuresults_01.dat from core
[06:21:06] (Read 230912 bytes from disk)
[06:21:06] Connecting to http://171.64.122.119:8080/
[06:21:12] Initial: 0000; - Uploaded at ~37 kB/s
[06:21:12] - Averaged speed for that direction ~33 kB/s
[06:21:12] + Results successfully sent
[06:21:12] Thank you for your contribution to Folding@home.
[06:21:16] Trying to send all finished work units
[06:21:16] + No unsent completed units remaining.
[06:21:16] - Preparing to get new work unit...
[06:21:16] + Attempting to get work packet
[06:21:16] - Connecting to assignment server
[06:21:16] Connecting to http://assign.stanford.edu:8080/
[06:21:17] Initial: 43AB; - Successful: assigned to (171.67.89.148).
[06:21:17] + News From Folding@Home: Welcome to Folding@Home
[06:21:17] Loaded queue successfully.
[06:21:17] Connecting to http://171.67.89.148:8080/
[06:21:18] Initial: 0000; - Receiving payload (expected size: 350965)
[06:21:20] - Downloaded at ~171 kB/s
[06:21:20] - Averaged speed for that direction ~212 kB/s
[06:21:20] + Received work.
[06:21:20] Trying to send all finished work units
[06:21:20] + No unsent completed units remaining.
[06:21:20] + Closed connections
[06:21:25]
[06:21:25] + Processing work unit
[06:21:25] Core required: FahCore_78.exe
[06:21:25] Core found.
[06:21:25] Working on Unit 02 [May 28 06:21:25]
[06:21:25] + Working ...
[06:21:25] - Calling 'FahCore_78.exe -dir work/ -suffix 02 -checkpoint 15 -service -forceSSE -verbose -lifeline 1028 -version 400'

[06:21:25]
[06:21:25] *------------------------------*
[06:21:25] Folding@home Gromacs Core
[06:21:25] Version 1.64 (April 29, 2004)
[06:21:25]
[06:21:25] Preparing to commence simulation
[06:21:25] - Assembly optimizations manually forced on.
[06:21:25] - Not checking prior termination.
[06:21:26] - Expanded 350453 -> 1761865 (decompressed 502.7 percent)
[06:21:26] - Starting from initial work packet
[06:21:26]
[06:21:26] Project: 543 (Run 13, Clone 42, Gen 82)
[06:21:26]
[06:21:26] Assembly optimizations on if available.
[06:21:26] Entering M.D.
[06:21:38] Protein: p543_BBA5_ext
[06:21:38]
[06:21:38] Writing local files
[06:21:38] Extra SSE boost OK.
[06:21:38] Writing local files
[06:21:38] Completed 0 out of 500000 steps (0)
[06:28:56] Writing local files
[06:28:56] Completed 5000 out of 500000 steps (1)
[06:36:07] Writing local files
[06:36:07] Completed 10000 out of 500000 steps (2)
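
For anyone sifting through a log like the one above, here is a minimal Python sketch that pulls out the frame timestamps and the lines that look like trouble. The marker strings are copied from this log; the function names are made up for illustration, and the timestamp handling assumes the frames do not cross midnight.

```python
import re
from datetime import datetime

# "[HH:MM:SS] message", the line shape used throughout the log above.
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s?(.*)$")

# Substrings copied from this log that mark a problem.
TROUBLE_MARKERS = (
    "Got a SIGTERM signal",
    "Checksums don't match",
    "Couldn't get size info",
    "Received faulty work unit",
    "BAD_WORK_UNIT",
)


def scan_log(text):
    """Return (frame_times, trouble_lines) found in a client log."""
    frame_times = []
    trouble = []
    for raw in text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # banner lines and blank lines carry no timestamp
        stamp, message = match.groups()
        if message.startswith("Finished a frame"):
            frame_times.append(datetime.strptime(stamp, "%H:%M:%S"))
        if any(marker in message for marker in TROUBLE_MARKERS):
            trouble.append((stamp, message))
    return frame_times, trouble


def average_frame_minutes(frame_times):
    """Mean gap between consecutive frames in minutes, or None if under two frames."""
    if len(frame_times) < 2:
        return None
    gaps = [(b - a).total_seconds() for a, b in zip(frame_times, frame_times[1:])]
    return sum(gaps) / len(gaps) / 60


SAMPLE = """\
[05:05:49] Finished a frame (204)
[05:14:28] Finished a frame (205)
[05:25:03] Finished a frame (206)
[05:35:14] Finished a frame (207)
[05:42:39] Finished a frame (208)
[05:50:31] Finished a frame (209)
[05:54:45] Received unknown signal from OS. Ignoring...Received unknown signal from OS. Ignoring...***** Got a SIGTERM signal (2) from OS
Folding@home Client Shutdown.
"""

frames, trouble = scan_log(SAMPLE)
print(len(frames), "frames, average", round(average_frame_minutes(frames), 1), "min per frame")
for stamp, message in trouble:
    print(stamp, message)
```

Run against the full log, the same scan should also flag the checksum failure at 05:56:51 and the BAD_WORK_UNIT shutdown at 06:21:04.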

linksep
05-29-04, 01:22 AM
Anyone have any idea what the problem is/was? I had been defragging, but only select folders (not system or F@H) using "DefragMentor Premium", and was using Windows Media Player when the problems happened.

That was about 120 points' worth of Tinker that was killed (235pt WU) in the middle of my battle with Gasoline... It woulda put me over the top. (OK, maybe not, but I hate to see 120 points killed with no credit.)
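
As a rough check on that figure, here is the arithmetic in Python, using the last finished frame in the log (209), the 400 frames the core reports for this unit, and the 235-point value quoted above. Treating credit as proportional to completed frames is only an assumption for illustration, and `estimated_points` is a made-up helper name.

```python
def estimated_points(frames_done, frames_total, unit_points):
    """Share of a unit's points matching the fraction of frames finished.

    Assumes credit scales linearly with frames, which is an illustration only.
    """
    return unit_points * frames_done / frames_total


lost = estimated_points(209, 400, 235)
print(round(lost, 1))
```

That lands a little above the 120 points mentioned, so the estimate holds up.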

I am at 46°C in BIOS during reboot (still hot from folding). I backed off my OC by 38MHz until I get this figured out...

Gasoline
05-29-04, 05:01 PM
If the CPU and memory are still proving to be stable at high overclock rates, then it could be the mobo's embedded disk controller corrupting the data written to disk due to the PCI bus speed running too far out of spec, since the IDE controllers are clocked from the PCI bus.

At whatever FSB speed you're at, can you tell what the divided-down resulting PCI bus speed is?

OLMI
05-29-04, 06:17 PM
I have seen that error on some of the computers I have folding as well. I have no idea what it is. :(

linksep
05-30-04, 01:39 AM
Originally posted by Gasoline: "At whatever FSB speed you're at, can you tell what the divided-down resulting PCI bus speed is?"

I have an A7N8X-Dlx Revision 2.00. I can't remember if it's AGP or PCI, but something is locked at 69MHz.

linksep
05-31-04, 08:36 PM
Anyone else have any ideas?
[/bump]

Gasoline
05-31-04, 11:41 PM
Originally posted by linksep: "I have an A7N8X-Dlx Revision 2.00. I can't remember if it's AGP or PCI, but something is locked at 69MHz."

Are you sure it's not locked at 66MHz? That would be the AGP graphics bus. 69MHz is out of spec with that too, but graphics is usually a bit more forgiving on bus speed tolerance. The PCI bus needs 33MHz, and I've generally had bad luck with disk I/O and embedded sound interfaces too, whenever the PCI gets more than 2 or 3 MHz out of spec. Maybe your mobo does lock the PCI and AGP in spec no matter what the FSB core speed is.... a lot of newer mobos can do that.
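
To put numbers on that, here is a small Python sketch of the bus arithmetic on a board that does not lock PCI/AGP. It assumes the PCI clock is the FSB divided by a fixed divider and that AGP runs at twice PCI. The FSB and divider values are hypothetical examples, not linksep's settings, and the 2 MHz tolerance is just the rule of thumb from the post above, not a figure from any spec.

```python
PCI_SPEC_MHZ = 100 / 3   # nominal PCI clock, 33.3 MHz
AGP_SPEC_MHZ = 200 / 3   # nominal AGP clock, 66.7 MHz


def derived_clocks(fsb_mhz, pci_divider):
    """Return (pci_mhz, agp_mhz) for an unlocked board.

    Assumes PCI = FSB / divider and AGP = 2 x PCI.
    """
    pci = fsb_mhz / pci_divider
    return pci, pci * 2


def pci_out_of_spec(pci_mhz, tolerance_mhz=2.0):
    """True if the PCI clock is further from nominal than the rule-of-thumb tolerance."""
    return abs(pci_mhz - PCI_SPEC_MHZ) > tolerance_mhz


# Hypothetical settings: a /5 divider at stock 166 MHz and overclocked to 180 MHz.
for fsb in (166, 180):
    pci, agp = derived_clocks(fsb, 5)
    print(fsb, round(pci, 1), round(agp, 1), pci_out_of_spec(pci))

# If 69 MHz really were an unlocked AGP clock, PCI would sit at half of that.
print(pci_out_of_spec(69 / 2))
```

Under those assumptions a 69 MHz AGP reading would put PCI at 34.5 MHz, inside the 2 MHz margin, while a 180 MHz FSB on a /5 divider would push it to 36 MHz.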

Mr. Chambers
06-01-04, 02:31 AM
that's an nforce2 chipset mobo, so it locks the pci/agp clockrates at 33/66, respectively - regardless of FSB speed.

first step would be to turn your OC down a little or off, then see if the problems go away. unstable overclocks can play havoc with folding.

walaka7
06-01-04, 08:22 AM
Come to think of it, I had the same problem a few weeks ago. The fix? I downloaded a new exe and deleted the old one. I thought it very strange, as I had never had a Tinker WU fail on me before. It was always a Gromacs WU that failed, especially back in the 3.x days. Anyhow, if nothing has really changed in your OC lately and temps have been on par, you might try that first before cutting back your OC. Good luck to ya.