Help with vm/smp
I have VMware Server running on 32-bit Ubuntu as the host. Its guest is 64-bit Ubuntu. Both are version 7.10 with ia32-libs installed. When running the client I get this strange error.
http://home.comcast.net/~cj145/help.png
I thought it might be something with how the proc is detected in vmware due to the "address family not supported by protocol" but I checked /proc/cpuinfo and it shows as Intel(R) Core(TM)2 CPU.
I'm running with the -smp -forceasm -verbosity 9 flags.
The work unit that it has is a 2605.
William Hung
02-01-08, 12:00 PM
Wow, I haven't seen this one before. Have you tried deleting the WU and getting another?
NedClocker
02-01-08, 12:06 PM
Dood! I was just about to post the exact same problem!!!
Maybe there's something wrong with Stanford's server? The program says "2 cores detected"!
Stanford is sending us NON-SMP work units but the program is expecting an SMP work unit because we used the -smp flag.
jeff@ned-desktop-VM1:~$ cd ~/folding/FAH
jeff@ned-desktop-VM1:~/folding/FAH$ ./fah6 -smp -forceasm -verbosity 9
Note: Please read the license agreement (fah6 -license). Further
use of this software requires that you have read and accepted this agreement.
2 cores detected
--- Opening Log file [February 1 19:09:40]
# SMP Client ##################################################################
###############################################################################
Folding@Home Client Version 6.00beta1
http://folding.stanford.edu
###############################################################################
###############################################################################
Launch directory: /home/jeff/folding/FAH
Executable: ./fah6
Arguments: -smp -forceasm -verbosity 9
Warning:
By using the -forceasm flag, you are overriding
safeguards in the program. If you did not intend to
do this, please restart the program without -forceasm.
If work units are not completing fully (and particularly
if your machine is overclocked), then please discontinue
use of the flag.
[19:09:40] - Ask before connecting: No
[19:09:40] - User name: Ned_Clocker (Team 32)
[19:09:40] - User ID: 5345A4CA45D1A65E
[19:09:40] - Machine ID: 1
[19:09:40]
[19:09:40] Loaded queue successfully.
[19:09:40] Deleting incompletely fetched item (4) from queue position #1
[19:09:40] - Warning: Could not delete all work unit files (1): Core file absent
[19:09:40] - Preparing to get new work unit...
[19:09:40] + Attempting to get work packet
[19:09:40] - Will indicate memory of 498 MB
[19:09:40] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 15, Stepping: 8
[19:09:40] - Connecting to assignment server
[19:09:40] Connecting to http://assign.stanford.edu:8080/
[19:09:40] - Autosending finished units...
[19:09:40] Trying to send all finished work units
[19:09:41] + Attempting to send results
[19:09:41] - Reading file work/wuresults_00.dat from core
[19:09:41] Posted data.
[19:09:41] Initial: 40AB; - Successful: assigned to (171.64.65.56).
[19:09:41] + News From Folding@Home: Welcome to Folding@Home
[19:09:41] (Read 5517519 bytes from disk)
[19:09:41] Connecting to http://171.64.65.56:8080/
[19:09:41] Loaded queue successfully.
[19:09:41] Connecting to http://171.64.65.56:8080/
[19:09:44] Posted data.
[19:09:44] Initial: 0000; - Receiving payload (expected size: 2173192)
[19:10:29] - Downloaded at ~47 kB/s
[19:10:29] - Averaged speed for that direction ~119 kB/s
[19:10:29] + Received work.
[19:10:29] + Closed connections
[19:10:29]
[19:10:29] + Processing work unit
[19:10:29] Core required: FahCore_a1.exe
[19:10:29] Core found.
[19:10:30] Working on Unit 01 [February 1 19:10:30]
[19:10:30] + Working ...
[19:10:30] - Calling './mpiexec -np 4 -host 127.0.0.1 ./FahCore_a1.exe -dir work/ -suffix 01 -checkpoint 15 -forceasm -verbose -lifeline 5111 -version 600'
[19:10:30]
[19:10:30] *------------------------------*
[19:10:30] Folding@Home Gromacs SMP Core
[19:10:30] Version 1.74 (November 27, 2006)
[19:10:30]
[19:10:30] Preparing to commence simulation
[19:10:30] - Assembly optimizations manually forced on.
[19:10:30] - Not checking prior termination.
[19:10:31] - Expanded 2172680 -> 12887285 (decompressed 593.1 percent)
[19:10:32] - Starting from initial work packet
[19:10:32]
[19:10:32] Project: 2605 (Run 1, Clone 75, Gen 0)
[19:10:32]
[19:10:32] Assembly optimizations on if available.
[19:10:32] Entering M.D.
NNODES=4, MYRANK=1, HOSTNAME=ned-desktop-VM1
NNODES=4, MYRANK=2, HOSTNAME=ned-desktop-VM1
NNODES=4, MYRANK=0, HOSTNAME=ned-desktop-VM1
NNODES=4, MYRANK=3, HOSTNAME=ned-desktop-VM1
NODEID=0 argc=15
NODEID=2 argc=15
NODEID=3 argc=15
NODEID=1 argc=15
Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2004, The GROMACS development team,
check out http://www.gromacs.org for more information.
This inclusion of Gromacs code in the Folding@Home Core is under
a special license (see http://folding.stanford.edu/gromacs.html)
specially granted to Stanford by the copyright holders. If you
are interested in using Gromacs, visit www.gromacs.org where
you can download a free version of Gromacs under
the terms of the GNU General Public License (GPL) as published
by the Free Software Foundation; either version 2 of the License,
or (at your option) any later version.
[19:10:38] Rejecting checkpoint
run input file work/wudata_01.xyz was made for 1 nodes,
while Core_A1.exe expected it to be for 4 nodes.: Address family not supported by protocol
Error on node 0, will try to stop all the nodes
[19:10:39]
[19:10:39] Folding@home Core Shutdown: UNKNOWN_ERROR
[0]0:Return code = 97
[0]1:Return code = 0, signaled with Quit
[0]2:Return code = 0, signaled with Quit
[0]3:Return code = 0, signaled with Quit
[19:10:50] CoreStatus = 61 (97)
[19:10:50] + Client running with incorrect SMP settings for work unit. Please check settings and restart
[19:12:00] Posted data.
[19:12:00] Initial: 0000; - Uploaded at ~38 kB/s
[19:12:01] - Averaged speed for that direction ~38 kB/s
[19:12:01] + Results successfully sent
[19:12:01] Thank you for your contribution to Folding@Home.
[19:12:01] + Number of Units Completed: 40
[19:12:08] + Sent 1 of 1 completed units to the server
[19:12:08] - Autosend completed
NedClocker
02-01-08, 12:10 PM
Wow, I haven't seen this one before. Have you tried deleting the WU and getting another?
Haven't tried that yet. It just downloaded this one!
Both VMs went all night w/o being able to get work. I restarted both of them this morning and that result is what I got on both VMs.
The error code is no help. What that means is:
61
Folding@home Core Shutdown: UNKNOWN_ERROR
CoreStatus = 61 (97)
+ Client running with incorrect SMP settings for work unit. Please check settings and restart
This error occurs when the -smp switch is not used when restarting the v6 SMP client in the middle of a work unit. The first example was from a Linux SMP Beta client.
which shouldn't be the problem, if you used that initialization string that you posted.
I am going to go with William on this one - you've got a bad WU.
If that doesn't work out, I'll just blame William, of course! :D :clap: :)
I did try deleting the wu (twice now).
run input file work/wudata_01.xyz was made for 1 nodes,
while Core_A1.exe expected it to be for 4 nodes.
It does seem like that could be the explanation Ned. The files also seem a bit small for a SMP work unit.
Edit: Check this new message out (found in work/wudata_010.log):
Program Core_A1.exe, VERSION 3.3
Source code file: tpxio.c, line: 1153
Fatal error:
Can not read file work/wudata_01.xyz
this file is from a Gromacs version which is older than 2.0
Make a new one with grompp or use a gro or pdb file, if possible
-----------------------------------------------------------
Thanx for Using GROMACS - Have a Nice Day
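The node-count mismatch quoted above can be picked out of a saved copy of the client output with a short shell function. This is only a sketch: the name `check_node_mismatch` and the idea of saving the output to a file first are assumptions for illustration, and the patterns match only the two Gromacs lines shown in this thread.

```shell
# Sketch: report when a work unit was built for a different number of
# nodes than the core expected. The function name and the saved-output
# file are hypothetical; the patterns come from the log lines in this thread.
check_node_mismatch() {
    log=$1
    # Last "was made for N nodes" value in the file, if any.
    made=$(sed -n 's/.*was made for \([0-9][0-9]*\) nodes.*/\1/p' "$log" | tail -n 1)
    # Last "expected it to be for M nodes" value in the file, if any.
    expected=$(sed -n 's/.*expected it to be for \([0-9][0-9]*\) nodes.*/\1/p' "$log" | tail -n 1)
    if [ -n "$made" ] && [ -n "$expected" ] && [ "$made" != "$expected" ]; then
        echo "mismatch: work unit built for $made node(s), core expected $expected"
        return 1
    fi
    return 0
}
```

Called as `check_node_mismatch saved_output.txt`, it prints one line and returns non-zero when the two counts differ, and stays silent otherwise.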
Hi guys
I had the same problem. Try removing the -smp flag. It worked for me.
I think they must have messed something up in the new client? I'm now working on an SMP work unit without the flag on, but when it's done, will it get a normal work unit?
NedClocker
02-01-08, 01:15 PM
Hi guys
I had the same problem. Try removing the -smp flag. It worked for me.
It's not an SMP work unit!!! Yes, it'll run if you don't use the SMP flag!
I'd rather be folding SMP units, though.
Yes, it will download another NON-SMP work unit when that one finishes. You could use the -oneunit flag though.
I don't know, I guess we will find out soon.
NedClocker
02-01-08, 01:17 PM
I think they must have messed something up in the new client? I'm now working on an SMP work unit without the flag on, but when it's done, will it get a normal work unit?
No, I haven't downloaded the new client yet.
Are you SURE you're working on an smp work unit? I don't think it'll do that without the flag on.
Weird.
It's folding with the FahCore_a1.exe but only running one thread.
Labeled as a P2605.
Reports 50,000 frames.
NedClocker
02-01-08, 01:20 PM
Hey! Try deleting the WU using ./fah6 -delete x. Then run it again. That worked for me.
Run ./fah6 -queueinfo first to find out which unit to delete!!!
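A small wrapper for the two commands mentioned above, as a sketch. The `-queueinfo` and `-delete` flags are the ones named in this thread; the function name `fah_queue` and passing the client directory as an argument are assumptions for illustration.

```shell
# Sketch: inspect the queue, or delete one slot from it.
# fah_queue is a hypothetical helper; only the two fah6 flags
# (-queueinfo, -delete) come from the thread.
fah_queue() {
    fah_dir=$1          # directory that contains the fah6 executable
    slot=${2:-}         # optional queue slot to delete
    if [ -z "$slot" ]; then
        # No slot given: just list the queue so a slot can be chosen.
        ( cd "$fah_dir" && ./fah6 -queueinfo )
    else
        # Slot given: delete that work unit from the queue.
        ( cd "$fah_dir" && ./fah6 -delete "$slot" )
    fi
}
```

Running `fah_queue ~/folding/FAH` first and then `fah_queue ~/folding/FAH 1` would mirror the order suggested above, with the slot number taken from the queue listing.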
NedClocker
02-01-08, 01:21 PM
Weird.
It's folding with the FahCore_a1.exe but only running one thread.
Labeled as a P2605.
Reports 50,000 frames.
Yes, the -smp flag tells it to use 4 cores, otherwise it uses only 1.
NedClocker
02-01-08, 01:24 PM
Let me know if that works for y'all. K?
I'm copying 500GB of stuff outside this vm atm. It's stuck on "Deleting work unit #1 from work queue..."
I will update this when it deletes it finally. >.>
Edit1: Failed to delete the requested work unit.
Going to wipe the fah folder and hope that works.
Edit2: wiped clean and started again. Got a new P2605 and it's running fine with 4 threads.
A bad batch of p2605s was loaded on the server. They've been removed and are being reprocessed. If, as someone suggested, you remove the -smp flag, you'll quit getting SMP work.
Oh so that's what happened. So what is Stanford going to screw up next?
WarriorII
02-01-08, 02:03 PM
nm.
So now that I have this working, is there a way to nice the vmware processes to 19 automatically? Every time I boot it I have to manually nice them all because it lags XBMC to death at -10.
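One possible way to do this, sketched as a shell function. Assumptions: the guest VMs show up as processes named `vmware-vmx` (worth confirming with `ps -e | grep vmware`), `pgrep` is installed, and the script runs as root if the processes belong to root or currently sit at a negative nice value. The name `renice_pids` is made up for the example.

```shell
# Sketch: set an absolute nice value on a list of process IDs.
# renice_pids is a hypothetical helper, not part of VMware or the FAH client.
renice_pids() {
    level=$1            # target nice value, e.g. 19
    shift
    for pid in "$@"; do
        # "renice LEVEL -p PID" sets the nice value to LEVEL.
        renice "$level" -p "$pid" >/dev/null || return 1
    done
}

# Usage, commented out so the sketch has no side effects when sourced.
# The process name vmware-vmx is an assumption:
# renice_pids 19 $(pgrep vmware-vmx)
```

Running it from a cron entry or from whatever script starts the VMs would make it automatic; which of those fits depends on the setup.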
Gee, and I was about to switch my quads to VMs. Guess I'll wait a bit and see.
Oh so that's what happened. So what is Stanford going to screw up next?
Never tempt the Gods! :)
NedClocker
02-01-08, 05:25 PM
Gee, and I was about to switch my quads to VMs. Guess I'll wait a bit and see.
My VMs have folded 40 WU each without a hitch until this happened! I think you'll be safe! You sure will not have to worry about folding simply stopping just because of a network hiccup.
I have no regrets about switching. :D
ihrsetrdr
02-02-08, 03:12 AM
Gee, and I was about to switch my quads to VMs. Guess I'll wait a bit and see.
I'm 'inching' my way closer to switching (these things take time). Just a while ago I downloaded the VM server for Linux & Windows plus enough serial #'s to take care of all the quads. ;)
I am glad to report that running SMP inside a vm with linux host and guest only seems to have lost me about 80ppd on a 2605 on a 3GHz c2d vs native linux.
NedClocker
02-02-08, 01:31 PM
Yeah, if you have a Linux host machine, I don't think you will see any benefit to running VMware on a C2D.
Maybe on a C2Q you would, though. A quad linux box is going to get assigned to the quad only server when it gets work. From what I've read, those work units are not worth as much in ppd.
Is that right, Chas? :D
The big benefit to running VMWare comes when you are running it on a Windows box.
William Hung
02-02-08, 02:42 PM
Agreed, quad only units are not as good for PPD... about 15% less...
I have to run it in a VM at the moment because the other software that I run (XBMC) doesn't work on 64-bit at all. I spent a good 2 days trying to get it to run in a chroot and couldn't keep it working well.
There would be no advantage to running VMware on a C2D Linux box unless it had to run 32-bit Linux as the host, as CJ has to. VMware overhead is on the order of 5% to 7%. There is a 100 ppd/GHz advantage to running VMware/Linux SMP over Windows SMP on a C2D. I successfully installed VMware with Ubuntu 7.10 as host running two Ubuntu 7.10 VMs and get 4440 ppd on two p2653 on a Q6600 @ 3.33 GHz.
NedClocker
02-05-08, 08:55 AM
So, it IS advantageous to run VMWare on a quad with Ubuntu as host??? Is that because it avoids the quad only server?
Absolutely correct.
NedClocker
02-05-08, 09:45 PM
Great! Thank you! I didn't know that! :)