Surferseth
01-30-10, 01:05 PM
I'm regularly seeing this error on my Ubuntu VM running -bigadv WUs. The VM has 4600 MB of RAM dedicated to it.
seth@seth-desktop:~/folding$ ./fah
Note: Please read the license agreement (fah6 -license). Further
use of this software requires that you have read and accepted this agreement.
8 cores detected
--- Opening Log file [January 30 06:19:35 UTC]
# Linux SMP Console Edition ################################################## #
################################################## #############################
Folding@Home Client Version 6.24R3
http://folding.stanford.edu
################################################## #############################
################################################## #############################
Launch directory: /home/seth/folding
Executable: ./fah6
Arguments: -smp 8 -bigadv -verbosity 9
seth@seth-desktop:~/folding$ [06:19:35] - Ask before connecting: No
[06:19:35] - User name: surferseth (Team 32)
[06:19:35] - User ID: 6FC4A9AC0DD3ABF0
[06:19:35] - Machine ID: 1
[06:19:35]
[06:19:35] Loaded queue successfully.
[06:19:35]
[06:19:35] + Processing work unit
[06:19:35] Core required: FahCore_a2.exe
[06:19:35] Core found.
[06:19:35] - Autosending finished units... [January 30 06:19:35 UTC]
[06:19:35] Trying to send all finished work units
[06:19:35] + No unsent completed units remaining.
[06:19:35] - Autosend completed
[06:19:35] Working on queue slot 03 [January 30 06:19:35 UTC]
[06:19:35] + Working ...
[06:19:35] - Calling './mpiexec -np 8 -host 127.0.0.1 ./FahCore_a2.exe -dir work/ -nice 19 -suffix 03 -checkpoint 15 -verbose -lifeline 9962 -version 624'
[06:19:35]
[06:19:35] *------------------------------*
[06:19:35] Folding@Home Gromacs SMP Core
[06:19:35] Version 2.10 (Sun Aug 30 03:43:28 CEST 2009)
[06:19:35]
[06:19:35] Preparing to commence simulation
[06:19:35] - Looking at optimizations...
[06:19:35] - Working with standard loops on this execution.
[06:19:35] - Files status OK
[06:19:45] is execution.
[06:19:45] - Files status OK
[06:21:33] (decompressed 101.8 percent)
[06:21:34] 49 (decompressed 101.8 percent)
[06:21:57] ressed_data_size=30327709 data_size=159726549, decompressed_data_size=159726549 diff=0
[06:21:57] ssed_data_size=159726549 diff=0
[06:21:59] - Digital signature verified
[06:21:59]
[06:21:59] Project: 2681 (Run 5, Clone 10, Gen 67)
[06:21:59]
[06:22:41] Entering M.D.
[06:22:47] Using Gromacs checkpoints
NNODES=8, MYRANK=0, HOSTNAME=seth-desktop
NODEID=0 argc=23
NNODES=8, MYRANK=1, HOSTNAME=seth-desktop
NNODES=8, MYRANK=2, HOSTNAME=seth-desktop
NODEID=2 argc=23
NNODES=8, MYRANK=3, HOSTNAME=seth-desktop
NODEID=3 argc=23
NNODES=8, MYRANK=4, HOSTNAME=seth-desktop
NODEID=4 argc=23
NNODES=8, MYRANK=5, HOSTNAME=seth-desktop
NODEID=5 argc=23
NNODES=8, MYRANK=6, HOSTNAME=seth-desktop
NODEID=6 argc=23
NNODES=8, MYRANK=7, HOSTNAME=seth-desktop
NODEID=7 argc=23
NODEID=1 argc=23
Reading file work/wudata_03.tpr, VERSION 3.3.99_development_20070618 (single precision)
Note: tpx file_version 48, software version 68
Reading checkpoint file work/wudata_03.cpt generated: Fri Jan 29 05:24:43 2010
NOTE: The tpr file used for this simulation is in an old format, for less memory usage and possibly more performance create a new tpr file with an up to date version of grompp
Making 1D domain decomposition 8 x 1 x 1
starting mdrun 'SINGLE VESICLE in water'
17000001 steps, 68000.0 ps (continuing from step 16938345, 67753.4 ps).
[06:23:14] data_03.log
[06:23:14] Verified work/wudata_03.trr
[06:23:16] Verified work/wudata_03.xtc
[06:23:16] Verified work/wudata_03.edr
[06:24:05] Completed 188344 out of 250000 steps (75%)
[06:47:42] Completed 190000 out of 250000 steps (76%)
[07:22:49] Completed 192500 out of 250000 steps (77%)
[07:57:49] Completed 195000 out of 250000 steps (78%)
[08:32:42] Completed 197500 out of 250000 steps (79%)
[09:07:42] Completed 200000 out of 250000 steps (80%)
[09:42:47] Completed 202500 out of 250000 steps (81%)
[10:17:34] Completed 205000 out of 250000 steps (82%)
t = 67823.363 ps: Water molecule starting at atom 904089 can not be settled.
Check for bad contacts and/or reduce the timestep.
t = 67823.363 ps: Water molecule starting at atom 876966 can not be settled.
Check for bad contacts and/or reduce the timestep.
[10:29:20]
[10:29:20] Folding@home Core Shutdown: INTERRUPTED
application called MPI_Abort(MPI_COMM_WORLD, 102) - process 0
[0]0:Return code = 102
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 0, signaled with Segmentation fault
[0]3:Return code = 0, signaled with Segmentation fault
[0]4:Return code = 0, signaled with Quit
[0]5:Return code = 0, signaled with Quit
[0]6:Return code = 0, signaled with Quit
[0]7:Return code = 0, signaled with Quit
[10:29:28] CoreStatus = 66 (102)
[10:29:28] + Shutdown requested by user. Exiting.***** Got a SIGTERM signal (15)
[10:29:28] Killing all core threads
Folding@Home Client Shutdown.
The error shows up at random. The client will fold for days, finishing WU after WU, and then it seems to become unstable. Any ideas?
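In case it helps anyone compare failures across several runs, here is a rough Python sketch that pulls the failure lines out of a log like the one above. The patterns are copied from this one log, so treating them as the complete set of failure markers is an assumption, and `scan_log` is just a helper name made up for this example, not part of the client.

```python
import re

# Patterns copied from the log in this post. Treating them as the
# relevant failure markers is an assumption, not client documentation.
FAILURE_PATTERNS = {
    "settle": re.compile(r"Water molecule starting at atom (\d+) can not be settled"),
    "mpi_abort": re.compile(r"MPI_Abort\(MPI_COMM_WORLD, (\d+)\)"),
    "core_status": re.compile(r"CoreStatus = (\w+) \((\d+)\)"),
    "signal": re.compile(r"\[0\](\d+):Return code = \d+, signaled with (.+)$"),
}


def scan_log(lines):
    """Return (line_number, kind, captured_groups) for every failure marker found."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for kind, pattern in FAILURE_PATTERNS.items():
            match = pattern.search(line)
            if match:
                hits.append((number, kind, match.groups()))
    return hits


if __name__ == "__main__":
    import sys

    # Hypothetical usage: pass the path of a saved log as the first argument.
    if len(sys.argv) > 1:
        with open(sys.argv[1], errors="replace") as handle:
            for number, kind, groups in scan_log(handle):
                print(number, kind, groups)
```

Running it over saved logs from several failed WUs would show whether the same atoms or the same MPI ranks come up each time. It only reports what the log already says and does not diagnose the cause.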