
Help!!!


Mark620

Member
Joined
Mar 29, 2003
My production has slumped recently and when I checked into it...

I am down 5 machines right now...

I have lost a HD on a machine :cry:

and 4 of my machines are doing this: :cry: :cry: :cry: :cry:

Launch directory: Z:\mnt\FAH\237
Executable: FAHConsole.exe
Arguments: -local -forceasm

Warning:
By using the -forceasm flag, you are overriding
safeguards in the program. If you did not intend to
do this, please restart the program without -forceasm.
If work units are not completing fully (and particularly
if your machine is overclocked), then please discontinue
use of the flag.

[00:41:46] - Ask before connecting: No
[00:41:46] - User name: Mark620 (Team 32)
[00:41:46] - User ID not found locally
[00:41:46] + Requesting User ID from server
[00:41:46] + Could not connect to Primary Assignment Server for ID
[00:41:46] + Could not connect to Secondary Assignment Server for ID
[00:41:46]
+ Could not get ID from server. Retrying...
[00:41:58] + Could not connect to Primary Assignment Server for ID
[00:41:58] + Could not connect to Secondary Assignment Server for ID
[00:41:58]
+ Could not get ID from server. Retrying...
[00:42:21] + Could not connect to Primary Assignment Server for ID
[00:42:21] + Could not connect to Secondary Assignment Server for ID
[00:42:21]
+ Could not get ID from server. Retrying...
[00:42:42] + Could not connect to Primary Assignment Server for ID
[00:42:42] + Could not connect to Secondary Assignment Server for ID
[00:42:42]
+ Could not get ID from server. Retrying...
[00:43:24] + Could not connect to Primary Assignment Server for ID
[00:43:24] + Could not connect to Secondary Assignment Server for ID
[00:43:24]
+ Could not get ID from server. Retrying...
[00:44:56] + Could not connect to Primary Assignment Server for ID
[00:44:56] + Could not connect to Secondary Assignment Server for ID
[00:44:56]
+ Could not get ID from server. Retrying...
[00:47:45] + Could not connect to Primary Assignment Server for ID
[00:47:45] + Could not connect to Secondary Assignment Server for ID
[00:47:45]
+ Could not get ID from server. Retrying...
[00:53:06] + Could not connect to Primary Assignment Server for ID
[00:53:06] + Could not connect to Secondary Assignment Server for ID
[00:53:06]
+ Could not get ID from server. Retrying...
 
Replace your broken HDD. The other machines are not getting a net connection. Can you ping a website from them to confirm? This looks like Linux. Is it some kind of LTSP setup?
 
Arkaine23 said:
Replace your broken HDD. The other machines are not getting a net connection. Can you ping a website from them to confirm? This looks like Linux. Is it some kind of LTSP setup?

Yes it's Overclockix LTSP2.0

The other strange thing is that the server and 2 other machines are functioning fine. It's just these 4.

Yes, I can ping the servers. I pinged 171.67.89.156 44 times from the 237 machine with 0% packet loss.
 
Does this problem persist after rebooting the problem clients?

Overclockix LTSP v2.1 is available now too, BTW.
 
Yes, I have rebooted many times and reconfigured the terminal server, I have deleted the FAH directory too. Fah/237 etc...

The server, 239 and 238 work fine. It's 237, 236, 235 and 234 that don't work.
They have a "VIA Rhine II" nic
and I am using the "VIA Rhine"
driver in the terminal server config.
 
1. I checked one by putting a nic in it, it does the same thing...this nic was used as "238" at one time so I know it works with stanford.

2. These CPUs were folding on other motherboards, so the only changes are the motherboard and the address of the client.

3. When I boot one directly from an Overclockix LTSP2.0 CD, I get the same error.


Edit:
4. I set up a win98 on one using V4 client and it worked.

WTF is wrong with the LTSP on these MOBO's???
 
It sounds to me like a problem with the onboard NIC. I had a board with a VIA Rhine II and could not get it online with Overclockix. I even tried installing the Linux drivers from VIA on an HDD install of OCix and still could not get it online. I wound up having to use a PCI Realtek NIC.

And yet these boxes are getting their OS through the network and can ping out. ... Can you post the /etc/resolv.conf from one of these boxes? You should be able to get it by-

as root-
ssh IP_Address less /etc/resolv.conf
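
A rough sketch of what to look for once the file is readable: pinging 171.67.89.156 by IP only shows that routing works. It does not exercise name resolution, so the thing to check is whether resolv.conf lists any nameserver at all. The helper name count_nameservers is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count the "nameserver" entries in a resolv.conf-style file.
# A result of 0 would mean the client has no DNS server configured.
count_nameservers() {
    # Treat a missing or unreadable file the same as "no nameservers".
    if [ ! -r "$1" ]; then
        echo 0
        return 1
    fi
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps that from looking like an error.
    grep -c '^[[:space:]]*nameserver[[:space:]]' "$1" || true
}

# Usage on the client itself:
#   count_nameservers /etc/resolv.conf
```

If this prints 0 on a failing client but 1 or more on a working one, the clients are probably not getting DNS settings from DHCP.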
 
When I typed on the box:

ssh 192.168.254.237 less /etc/resolv.conf

The authenticity of host '192.168.254.237 (192.168.254.237)' can't be established.
RSA key fingerprint b0:38:14;87:0a:65:57:9a:50:b1:d9:29:92:52:e9:db.
Are you sure you want to continue connecting (yes/no)? yes


Then when I tried:

ssh 192.168.254.237 less /etc/resolv.conf

I got:

ssh_exchange_identification: Connection closed by remote host
 
How can I redo Overclockix and not lose my partially finished units?

Edit:

By redo I mean reinstall or install a newer version...it is a HD install
 
k will do it one step at a time then

ssh IP_Address

type yes when asked the question

less /etc/resolv.conf


You should be able to reboot a client by ssh'ing in as root and then running the command-

reboot

It should then pick up the work where it left off.

It's kind of hard for me to diagnose the trouble since I didn't build this version of Overclockix and have not had a chance to try it out. Overdoze is the guy to ask. He stops by every few days usually....
 
Arkaine23 said:
k will do it one step at a time then

ssh IP_Address

type yes when asked the question

less /etc/resolv.conf

It's kind of hard for me to diagnose the trouble since I didn't build this version of Overclockix and have not had a chance to try it out. Overdoze is the guy to ask. He stops by every few days usually....

I did say "yes"; it made no difference. BTW I have also tried a different NIC and it did the same thing. I have also tried different addresses (220-190); that made no difference either.

Edit:
Another thing - Even though these clients are not working on folding, stanford is counting them in my processor count....
 
Mark620 said:
Yes, I have rebooted many times and reconfigured the terminal server, I have deleted the FAH directory too. Fah/237 etc...

The server, 239 and 238 work fine. It's 237, 236, 235 and 234 that don't work.
They have a "VIA Rhine II" nic
and I am using the "VIA Rhine"
driver in the terminal server config.

The reason is that Overclockix LTSP does not include the correct driver for your VIA Rhine NIC. VIA provided only a Windows driver and did not release a Linux driver.
Note that VIA Rhine and VIA Rhine II are different revisions.

You might want to use a different NIC, since Overclockix does not have the correct driver for yours. Apparently there are two different NIC chipsets from VIA, both named VIA Rhine II, and only the first one, from D-Link, works. The later chipset produced by VIA does not work. The next Knoppix release might have the VIA Rhine driver straightened out.
The best bet for now is still to use an external PCI NIC and disable the VIA Rhine NIC in the BIOS.

EDIT: If for some reason pump is not working because it timed out too soon, the way to fix it is to type the following when the client boots up:
# pkill pump
# pump -v
It is in the documentation under Debug.

If that fixes your problem, then you will want Overclockix 2.1, which automatically reruns pump if DNS is not resolved the first time.
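
For anyone staying on 2.0, here is a minimal sketch of that retry idea. It is not the actual Overclockix 2.1 script. The function name retry_pump and the three override variables are made up, the renewal commands are just the two quoted above, and it assumes getent is available on the client.

```shell
#!/bin/sh
# Sketch only: rerun the DHCP client until a hostname resolves.
# All three settings can be overridden, e.g. to try the loop without root.
RESOLVE_CMD=${RESOLVE_CMD:-"getent hosts"}     # assumption: getent exists on the client
RENEW_CMD=${RENEW_CMD:-"pkill pump; pump -v"}  # the two commands from the post above
RETRY_DELAY=${RETRY_DELAY:-5}                  # seconds to wait after each renewal

# retry_pump HOST [MAX_RENEWALS]
retry_pump() {
    host=$1
    max_renewals=${2:-3}
    renewals=0
    while :; do
        # Stop as soon as the name resolves.
        if $RESOLVE_CMD "$host" >/dev/null 2>&1; then
            echo "resolved after $renewals renewal(s)"
            return 0
        fi
        # Give up once the renewal budget is used.
        if [ "$renewals" -ge "$max_renewals" ]; then
            echo "still not resolving after $renewals renewal(s)"
            return 1
        fi
        sh -c "$RENEW_CMD" >/dev/null 2>&1
        renewals=$((renewals + 1))
        sleep "$RETRY_DELAY"
    done
}

# Usage (as root on the client), with a hostname of your choosing:
#   retry_pump some.host.name 3
```

If it still reports no resolution after a few renewals, the problem is more likely the NIC driver than a pump timeout.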
 
Mark620 said:
How can I redo Overclockix and not lose my partially finished units?

Edit:

By redo I mean reinstall or install a newer version...it is a HD install

Use a Windows box; you can back up the FAH directory remotely to it.
After the reinstall, once all the clients have started up, stop the folding service on each client and copy each WU back. You should copy only the work folder and the file queue.dat. Then start the folding services.
Check out the FAQ in the documentation.
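
If you would rather do the backup from a shell on the server than from Windows, something along these lines should do it. This is only a sketch: the function name backup_wus is made up, and you pass in your own FAH directory and backup location, since I don't know your layout.

```shell
#!/bin/sh
# Sketch: save only the "work" folder and queue.dat from each client directory.
# backup_wus FAH_ROOT DEST   (both paths are supplied by you)
backup_wus() {
    fah_root=$1
    dest=$2
    for client in "$fah_root"/*/; do
        # Skip directories that hold no queue, i.e. no work unit in progress.
        [ -f "${client}queue.dat" ] || continue
        name=$(basename "$client")
        mkdir -p "$dest/$name"
        cp -p "${client}queue.dat" "$dest/$name/"
        if [ -d "${client}work" ]; then
            cp -Rp "${client}work" "$dest/$name/"
        fi
        echo "saved $name"
    done
}

# Usage, with the folding service stopped so the files are not mid-write:
#   backup_wus /path/to/FAH /path/to/backup
```

Restoring is the same copy in the other direction, again with the folding service stopped on each client.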
 
overdoze said:
Best bet now is still using an external PCI NIC card and disable the VIA Rhine NIC in the bios.

I did try a different NIC and it did the same thing, but I did not disable the VIA NIC. I will try it that way.

overdoze said:
EDIT: If for some reason pump is not working because it timed out too soon, the way to fix it is to type the following when the client boots up:
# pkill pump
# pump -v
It is in the documentation under Debug.

If that fixes your problem, then you will want Overclockix 2.1, which automatically reruns pump if DNS is not resolved the first time.

Could I have a little more info on pump, or better yet the location of the documentation? I haven't found it.
 
When you boot from the CD-ROM or the server (after installing), you will see the How_to_Setup_Folding_Farm.html file.
Click it and search for Debug.
The online how-to is here
Look for the LTSP instructions.
 
Ok, it is the nic :(

I don't have enough room in my setup for floppy drives. Where can I get boot ROMs for my NICs? I have about a dozen Olicom OC-2326 NICs with a boot ROM chip holder but no chip. I have one 3Com NIC with PXE boot, but it is too tall to fit my half-height case, whereas the Olicoms do fit. I found a PXE network card, but for the price I might as well replace the motherboards.

Thanks for the help.
 
overdoze said:
When you boot from the CD-ROM or the server (after installing), you will see the How_to_Setup_Folding_Farm.html file.
Click it and search for Debug.
The online how-to is here
Look for the LTSP instructions.

I have seen the "How_to_Setup_Folding_Farm.html file" I was looking for Linux instructions.
 
We can't cover every possible situation in the docs, but there's plenty of knowledgeable linux folks here at ocforums. Just tell us what you want to do and we can help, or post your question in the Alt OS forum where all the linux gurus hang out.
 