Could not connect to work server error


mateo88
01-01-07, 01:50 PM
The program keeps telling me that it is attempting to get a work packet, but it cannot connect to the work server. I am behind a router, and the ports are forwarded. F@H downloaded the first WU, completed it, and uploaded it fine, but it can't get new ones. Any ideas? :shrug:

Shelnutt2
01-01-07, 01:55 PM
A) Did you just upgrade to IE7 since the first WU?
- If so, go into the config and tell it not to use IE settings.
B) If you are not using IE7, then it's a Stanford issue. They are rewiring the electrical in their server room, so there might be slight downtime on certain servers, although the work server should not have been affected.

mateo88
01-01-07, 02:13 PM
Hmmm... I haven't even touched IE in about 3 years. Is anyone else getting this message? The tray icon says it's attempting, but the full screen mode says it can't connect.

If it helps to diagnose the problem, this is what the log file just says each time it attempts:


[20:09:45] + Attempting to get work packet
[20:09:45] - Connecting to assignment server
[20:09:45] - Successful: assigned to (171.64.122.133).
[20:09:45] + News From Folding@Home: Welcome to Folding@Home
[20:09:45] Loaded queue successfully.
[20:09:46] + Could not connect to Work Server
[20:09:46] - Error: Attempt #1 to get work failed, and no other work to do.
Waiting before retry.

pscout
01-01-07, 02:31 PM
I have had a few rigs go into retries lately to get a new WU ... not sure if it is some leftover effect from the power upgrades and server outages that Stanford went through late last week.

If it was working before, and nothing has changed on your end, then it is likely a problem at your ISP or at Stanford.

I have shut down and restarted clients a few times lately when I have noticed this ... it speeds up the next retry and has usually been successful in getting a new WU right away.

I have also noticed a few wu's being turned in to the collection server lately which means their normal server is down or too busy.

So hopefully this is just a transient problem.

If it persists let us know.

ihrsetrdr
01-01-07, 02:33 PM
That server may just be busy; happened to me yesterday. After 7 attempts it picked up a wu off another server.

mateo88
01-01-07, 02:35 PM
It tried to connect 24 times before I restarted the program in hope of a fix. Oh well. If there's nothing I can do, there's no point in trying to fix it. Hopefully it'll get going again soon.

pscout
01-01-07, 02:51 PM
Are you running the GUI client?

Not sure if that could have anything to do with it ... few of us run it cuz of past problems, so I don't have any experience to help with if it is part of the problem. If it is the GUI, a reboot of windoze might help :shrug:

muddocktor
01-01-07, 05:06 PM
I think I see the problem you are having just by seeing which server it's trying to connect to, mateo88. Server 133 is an old timeless tinker server, and since Stanford has no timeless WUs being sent out, you can't get any work. Somehow you set the client up to look for timeless work units, and when that is set, you stand very little chance of getting work.

If you are running the GUI client, I think there is a section in advanced properties (right click the open screen) that will let you adjust whether you want to set the client for timeless or folding only. Set it for folding only. I'm going by memory here as I haven't used the GUI client in ages.

If you are using the console client, then make a shortcut and set the -configonly flag in the target, then start the client with the new shortcut. This will let you reset the client configuration. When you get to the part about advanced options, answer yes, and you will eventually get to the question on whether you want to set the client for timeless work, to which you will answer no.

ihrsetrdr
01-01-07, 06:15 PM
Muddocktor hit it right on the dime...here's the skinny (http://forum.folding-community.org/portal.php?topic_id=16235)

mateo88
01-02-07, 01:02 AM
i <3 muddocktor


I definitely set it to get work without deadlines. I wasn't really sure what kind of deadlines it was talking about, and I didn't want to have my system turned off for a while and miss one. I switched it back about six seconds ago and it works wonderfully now. Thanks!

muddocktor
01-02-07, 01:19 AM
No problem man. :)

orion456
01-03-07, 01:10 AM
I have reset my client.cfgs as you advised above but FAH502 still repeatedly refuses to send a number of completed WUs. This is happening on 3 different WINXP machines.

Here is part of the log file in which units 03, 05, and 09 can't be sent and are sitting in the work directory queue. Unit 05 was finished in November.

[06:47:54] Couldn't send HTTP request to server (wininet)
[06:47:54] + Could not connect to Work Server (results)
[06:47:54] (171.65.103.160:8080)
[06:47:54] - Error: Could not transmit unit 05 (completed November 28) to work server.


[06:47:54] + Attempting to send results
[06:47:54] Error: Got status code 503 from server
[06:47:54] + Could not connect to Work Server (results)
[06:47:54] (171.65.103.100:8080)
[06:47:54] Could not transmit unit 05 to Collection server; keeping in queue.


[06:47:54] + Attempting to send results
[06:48:15] Couldn't send HTTP request to server (wininet)
[06:48:15] + Could not connect to Work Server (results)
[06:48:15] (171.65.103.151:8080)
[06:48:15] - Error: Could not transmit unit 09 (completed December 28) to work server.


[06:48:15] + Attempting to send results
[06:48:15] Error: Got status code 503 from server
[06:48:15] + Could not connect to Work Server (results)
[06:48:15] (171.65.103.100:8080)
[06:48:15] Could not transmit unit 09 to Collection server; keeping in queue


Any theories about how to get these WUs out? I think I'm up to 7 now that won't go out.

ChasR
01-03-07, 11:21 AM
@orion
[06:47:54] Error: Got status code 503 from server
[06:47:54] + Could not connect to Work Server (results)
[06:47:54] (171.65.103.100:8080)
[06:47:54] Could not transmit unit 05 to Collection server; keeping in queue.


[06:47:54] + Attempting to send results
[06:48:15] Couldn't send HTTP request to server (wininet)

is typically associated with the use of IE settings (Edit: or incorrect proxy settings). Setting "use IE settings" to no and "use proxy" to no usually fixes things. I'll post the standard connectivity test in a bit.

ChasR
01-03-07, 12:19 PM
Here's the standard connectivity test:

What happens when you click the following links?

http://www.stanford.edu/group/pandegroup/
http://assign2.stanford.edu/
http://171.65.103.100
http://assign.stanford.edu:8080/

You should get the Pandegroup site on the first, and in IE a screen saying OK on the last three. If you don't get the OK, something is blocking the connection.
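
For anyone who would rather script the check than click each link, here is a minimal Python sketch of the same test. The URLs are the ones listed above; the function names (`fetch`, `run_connectivity_test`) and the three result labels are made up for illustration, and the servers may not answer the way they did when this was posted. It goes straight to the network with the standard library, so it does not reproduce whatever IE or the FAH client does with proxy settings.

```python
from typing import Callable, Dict, List
from urllib.request import urlopen

# URLs copied from the connectivity test in the post above.
TEST_URLS: List[str] = [
    "http://www.stanford.edu/group/pandegroup/",
    "http://assign2.stanford.edu/",
    "http://171.65.103.100",
    "http://assign.stanford.edu:8080/",
]


def fetch(url: str, timeout: float = 10.0) -> str:
    """Return the response body as text; raises OSError if the request fails."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def run_connectivity_test(
    urls: List[str], fetcher: Callable[[str], str] = fetch
) -> Dict[str, str]:
    """Map each URL to 'reachable', 'blank', or 'blocked: <reason>'."""
    results: Dict[str, str] = {}
    for url in urls:
        try:
            body = fetcher(url)
        except OSError as exc:  # URLError, HTTPError and timeouts are OSError subclasses
            results[url] = f"blocked: {exc}"
            continue
        # An empty page is reported separately, since a blank screen is not the expected "OK".
        results[url] = "reachable" if body.strip() else "blank"
    return results
```

Calling `run_connectivity_test(TEST_URLS)` and printing the returned dictionary gives one line of status per link.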

Adak
01-03-07, 07:31 PM
Error code 503 is the indication the server is busy, or down. You ARE connecting with the server (otherwise you'd not have an error code from it). It just isn't set up to yak with your system, at this time.
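
To make that distinction concrete, here is a small Python sketch that sorts the two kinds of log line seen in this thread. The message patterns are copied from the log excerpts posted above; the function name `classify` and the wording of the returned descriptions are illustrative only, and the hint about proxy/IE settings reflects the advice given earlier in the thread, not anything the client itself reports.

```python
import re
from typing import Optional

# Matches lines such as "[06:47:54] Error: Got status code 503 from server".
STATUS_RE = re.compile(r"Got status code (\d{3}) from server")


def classify(line: str) -> Optional[str]:
    """Describe a FAH log line, or return None if it is not one of the two cases."""
    match = STATUS_RE.search(line)
    if match:
        code = int(match.group(1))
        if code == 503:
            # A status code means the server answered, so the connection itself worked.
            return "server reached but busy or unavailable (HTTP 503)"
        return f"server reached, HTTP status {code}"
    if "Couldn't send HTTP request to server" in line:
        # No status code at all: the request never got an answer.
        return "request not delivered (check proxy / IE settings)"
    return None
```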

One thing to check is whether you accidentally signed up, in your configuration, to request "deadlineless" (aka timeless tinker) WUs. If so, you'll get nothing but connection errors when you try to get a new WU, since tinkers have been stopped.

If you're not sure just what you specified in the original configuration questions on this, just start the FAH client program with the " -config" flag (one space, one hyphen with no space after it, and "config" or "configonly" after the hyphen).

If you don't know how to set up flags like this for the console version, you can just delete the client.cfg file, and it will force the FAH client to restart its configuration function, just like the first time you started it.

No folding work will be lost. Just always close it with "Ctrl + C", and never via the "X" in the upper right-hand corner of its window.

If that doesn't solve it, let me know and I'll dig into this in more detail. We want you folding! :D

Adak

orion456
01-03-07, 08:49 PM
http://171.65.103.100/


returns a blank screen....the others return OK.

orion456
01-03-07, 08:58 PM
Here's the configuration file:

[settings]
username=orion456
team=32
asknet=no
machineid=1
local=1

[http]
active=no
host=localhost
port=8080
usereg=no

[core]
checkpoint=10

[clienttype]
type=1

I did have one of the two clients set for no-preference and the other set to no, but both of them have WUs that haven't been sent and keep being refused. At times it also says in the log that no work could be accessed.

orion456
01-03-07, 09:27 PM
Thanks for your help guys.

Ok, looks like you helped me fix the problem of not getting my WUs sent, but now I have another mystery.

Looking at the log file for Jan 3, it seems the backlogged WUs were sent this morning and successfully received. As I read these logs, it means that 4 WUs were sent after midnight (since the times are UT and I am -6 for local time). But if you look here:

http://folding.extremeoverclocking.com/user_summary.php?s=&u=90633

I only have one WU registered after midnight. Why might that be?

-------------fah1 ------------------
[06:47:54] - Error: Could not transmit unit 05 (completed November 28) to work server.
[06:49:44] + Attempting to send results
[06:50:42] + Results successfully sent

[07:03:48] - Error: Could not transmit unit 09 (completed December 28) to work server.
[07:03:48] + Attempting to send results
[07:04:02] + Results successfully sent

--------------- fah2 -----------------

[06:47:48] - Error: Could not transmit unit 00 (completed December 22) to work server.
[06:48:27] - Error: Could not transmit unit 03 (completed November 3) to work server.

[06:49:59] + Attempting to send results
[06:51:19] + Results successfully sent

[06:51:19] + Attempting to send results
[06:51:54] + Results successfully sent
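
The UT-to-local arithmetic can be double-checked with a short Python sketch. It assumes the log timestamps are UTC on 2007-01-03 (the date is taken from the post, the year from the thread date) and a fixed UTC-6 offset with no daylight saving; the function name `log_time_to_local` is made up for illustration. It only converts times and says nothing about when the stats site credits a WU.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the poster's local time is a fixed UTC-6 offset, as stated in the post.
LOCAL = timezone(timedelta(hours=-6))


def log_time_to_local(stamp: str, utc_date: str) -> datetime:
    """Convert a log stamp like '06:50:42' on a UTC date like '2007-01-03' to local time."""
    utc = datetime.strptime(f"{utc_date} {stamp}", "%Y-%m-%d %H:%M:%S")
    return utc.replace(tzinfo=timezone.utc).astimezone(LOCAL)


# The four "Results successfully sent" stamps from the two logs above.
for stamp in ("06:50:42", "07:04:02", "06:51:19", "06:51:54"):
    print(stamp, "UT ->", log_time_to_local(stamp, "2007-01-03").strftime("%Y-%m-%d %H:%M:%S"))
```

Under those assumptions, all four sends fall between 00:50 and 01:05 local time on Jan 3, which matches the reading that they went out after midnight.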

orion456
01-03-07, 09:35 PM
Also what are .xtc files? I have a few megs of those files in my working directory but listed as 02 and 03 while it is working on 04.

Adak
01-03-07, 10:06 PM
What FAH client are you folding with, Orion?

The only FAH client I know of with .xtc files is the new SMP client. I have them in my folding / work directory, but have no knowledge of what they're for.

Reading the FAH forum, I might hazard a guess that they're related to the "jump" algorithm being tried out. If so, it just helps the simulated protein to "jump" into a more productive folding environment, by suddenly increasing the relevant heat energy. And that's just a wild guess on my part, so stand ready to throw that idea out with the garbage. :)

If indeed you're folding with either high-performance (GPU or SMP) client, the reason you may not see any points for your WUs is that the WU may have been returned after its final deadline. I'm unsure about the GPU, but the SMP preferred deadline is only two days, and the final deadline is just three days iirc.

I'm unclear why you have the line "type = 1" in your client.cfg file. I'm running SMP, and I don't have that line in my client.cfg file. What is the purpose of that line, exactly? My Windows folder has that line (it folds big packets), but it is set to "type = 0".

When I just now tried this url: http://171.65.103.100/ , it returned an OK on an otherwise blank screen. Can you try this again, Orion?


Adak

orion456
01-04-07, 12:08 AM
Windows Task Manager says I'm running FAH502 twice.

I don't know how the type got in there, I just ran:

fah502-console -configonly -local

Clicking on that link again gives a blank screen again.

Adak
01-04-07, 01:43 AM
Not to worry, Orion. I'm just not used to 5.02 anymore.

That clears up some of the mystery for me. You received one of the SMP project WUnits, in error.

Some were mistakenly loaded onto Windows assignment servers about that time. Naturally, the SMP results can't be returned to a Windows collection server. The 02, 03, and 04, you spoke of, are the 4 threads that run in the SMP client - but only with the SMP core, only in Linux, and at least a dual core.

Here's what I'd like you to do:

1) Stop folding, and delete your entire FAH and its work directories. Yes, both of the FAHs, and both of their directories. Those WUs are overdue and have already been given out to someone else and folded.

2) Please download the 5.04Beta version of the console. This is a beta with several improvements over 5.02, and it has been running very stably for over a year now.

3) Set up two FAH directories (I use FAH1 and FAH2, but use whatever names you like) and start one client at a time. Each will go into config mode, and then all your client.cfg file contents will be the same as what I (and most other folders) have.

Please let it install using big packets > 5 megs, but not as a service - just as a program running in the foreground. I'll show you some tricks to make it sweet. If you want to make it a service later, the FAH client will do that for you, just run it with the config flag " -config" or " -configonly", and it will ask if you want to install it as a service.

Keep the checkpoints at 15 minutes, and be careful with your folding name and team number entry. Some have mistakenly folded for the wrong team through a misspelled name or typo on the team number.

Post up when you're done with that, and we'll take it from there,

Adak

Sorin
01-04-07, 07:18 AM
Maybe Stanford is still doing upgrades or whatever to their data center, but I finished a work unit a little while ago and that instance is trying to connect to .134 for work. It's tried 8 times and gotten a 503 error each time. I finished a work unit about an hour before that one on my same box, but it was connecting to .128 and got more work fine. Went to the F@H forums, and it seems to be normal lately. Working on their server room or network outages or some such thing.

ChasR
01-04-07, 07:33 AM
In FAH 5.02, client type=1 means folding only, type=0 is no preference, and type=2 is deadlineless and no longer valid. In FAH 5.04, type=3 is -advmethods.
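
As a quick way to check what a given client.cfg is set to, here is a minimal Python sketch that reads the `[clienttype]` section and looks the value up in the mapping described above. The section and key names come from the client.cfg posted earlier in the thread; the function name `describe_client_type` and the returned wording are illustrative, and the sketch assumes the file parses as ordinary INI text.

```python
import configparser

# Meanings as described in the post above (values 0-2 for FAH 5.02, 3 for FAH 5.04).
CLIENT_TYPES = {
    "0": "no preference",
    "1": "folding only",
    "2": "deadlineless (no longer valid)",
    "3": "-advmethods (FAH 5.04)",
}


def describe_client_type(cfg_text: str) -> str:
    """Return a description of the [clienttype] type value in client.cfg text."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    value = parser.get("clienttype", "type", fallback=None)
    if value is None:
        return "no [clienttype] type entry"
    return CLIENT_TYPES.get(value.strip(), f"unknown type {value}")


# The relevant part of the client.cfg posted earlier in the thread.
sample = """
[settings]
username=orion456
team=32

[clienttype]
type=1
"""
print(describe_client_type(sample))
```

To run it against a real file, read the file's contents into a string and pass that in place of `sample`.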

All work directories contain .xtc files, one per WU and have for as long as I can remember. I'd run qfix (http://home.comcast.net/~wxdude2/rph/fah.html) on the instance with the hung WUs before changing anything else.

I never had trouble with the standard connectivity test until installing IE7. With it, OK is a 50-50 proposition. I doubt he got an SMP WU and fail to see how you can come to that conclusion with the info provided so far. Edit: I find that to get the OK in IE7, I have to delete the last slash in the address bar and then hit Go (green arrow). Deleting it from the link doesn't work as it gets added into the address bar automatically.

Adak
01-04-07, 10:50 AM
Quite right, ChasR, I just glanced in my old Windows folding directories, and they have no .xtc files whatsoever. Of course, they were stopped after completing their Windows folding, so I could fold with the SMP client, which is why they have no .xtc file. :)

The FAH forum mentioned that several SMP WUs had inadvertently been assigned to Windows users, causing all kinds of problems. That fit in with his return of two completed WUs, less than 60 days old, for no points.

My opinion of Qfix is don't use it. It may be good for Orion, etc., but if Stanford wants their WU returned for 200,000 CPUs, then STANFORD needs to program it correctly - not ask or expect all its users to muck around with some third party utility! (which doesn't always work, btw. Then there is still *another* third party utility that you can muck around with to try to sort out the queue's problem(s).)

No way! The leverage for the fix is all at Stanford's end, not ours. Make them fix it! It may not add to Orion's point total for today, but it WILL speed up our whole project just by making Stanford improve its program - which will positively impact every folder.

Adak

orion456
01-12-07, 01:02 AM
Ok, Fah504 installed twice and running. Seems to be working fine.