
New Diskless Cruncher difficulty


rogerdugans

Linux challenged Senior, not that it stops me...
Joined: Dec 28, 2001
Location: Corner of No and Where
I haven't added, changed, or even tweaked anything on my 3-PC cluster for a while because I haven't had time, but a new problem has started occurring:

Every few days, one of my clients gets stuck with a completed WU and can't upload it. :(

Rebooting the client does not fix the problem; I have to restart the server and then the clients to get the WU to upload each time.

screenlog.txt just shows that SETI couldn't connect and will retry in an hour. /var/log/messages shows this:

Sep 20 19:37:29 ltspserver dhcpd: DHCPREQUEST for 192.168.0.3 from 00:50:bf:40:0c:10 via eth0
Sep 20 19:37:29 ws003.ltsp dhclient: DHCPREQUEST on eth0 to 192.168.0.254 port 67
Sep 20 19:37:29 ltspserver dhcpd: DHCPACK on 192.168.0.3 to 00:50:bf:40:0c:10 via eth0
Sep 20 19:37:29 ws003.ltsp dhclient: DHCPACK from 192.168.0.254
Sep 20 19:37:29 ws003.ltsp dhclient: bound to 192.168.0.3 -- renewal in 10800 seconds.
Sep 20 19:37:31 ws003.ltsp -- MARK --
Sep 20 20:37:17 ws001.ltsp -- MARK --
Sep 20 20:37:32 ws003.ltsp -- MARK --
Sep 20 21:37:17 ws001.ltsp -- MARK --
Sep 20 21:37:32 ws003.ltsp -- MARK --

as the most recent entry.

I am thinking it is a DHCP timeout problem, but...
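One way to test the DHCP-timeout theory is to compare each client's promised renewal time against the last time that client wrote anything to the log. The following is only a rough sketch, assuming the syslog line format shown in the excerpt above; the function name `renewal_status` and the fixed placeholder year are made up for illustration and are not part of LTSP or dhclient.

```python
import re
from datetime import datetime, timedelta

# Matches dhclient lines such as:
#   "Sep 20 19:37:29 ws003.ltsp dhclient: bound to 192.168.0.3 -- renewal in 10800 seconds."
BOUND_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+dhclient:\s+"
    r"bound to (?P<ip>\S+) -- renewal in (?P<secs>\d+) seconds"
)
# Matches the timestamp and hostname at the start of any syslog line.
TS_RE = re.compile(r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)")


def parse_ts(text, year=2000):
    # Syslog timestamps carry no year, so a placeholder year is assumed here.
    return datetime.strptime(f"{year} {text}", "%Y %b %d %H:%M:%S")


def renewal_status(lines):
    """Return {host: (renewal_due, last_seen, overdue)} for hosts with a 'bound to' line."""
    due = {}
    last_seen = {}
    for line in lines:
        m = TS_RE.match(line)
        if not m:
            continue
        ts = parse_ts(m.group("ts"))
        host = m.group("host")
        last_seen[host] = ts  # any entry, including MARK, counts as a sign of life
        b = BOUND_RE.match(line)
        if b:
            due[host] = ts + timedelta(seconds=int(b.group("secs")))
    # A host is flagged as overdue if it logged something after its renewal was due.
    return {h: (d, last_seen[h], last_seen[h] > d) for h, d in due.items()}


if __name__ == "__main__":
    sample = [
        "Sep 20 19:37:29 ws003.ltsp dhclient: bound to 192.168.0.3 -- renewal in 10800 seconds.",
        "Sep 20 19:37:31 ws003.ltsp -- MARK --",
        "Sep 20 21:37:32 ws003.ltsp -- MARK --",
    ]
    for host, (renew_at, seen, overdue) in renewal_status(sample).items():
        print(host, "renewal due", renew_at.time(), "last seen", seen.time(), "overdue:", overdue)
```

Run against the lines posted above, the renewal for ws003.ltsp would not be due until 22:37:29, while the last entry is at 21:37:32, so this excerpt by itself does not show a missed renewal. A longer stretch of the log would be needed to see whether a DHCPREQUEST ever fails to appear after the due time.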

Anyone have ideas?
 
Looks like the client is not getting its IP renewed and consequently it can't get out on the LAN to fetch a unit. As for why this is happening, I'm gonna hang my head down low and kick the bucket a few times :rolleyes:
 
1) Yeah, I'm getting that too: completed WU, can't send.

2) Sometimes, after I've rebooted the machines, one machine gets stuck in the middle of a WU and can't continue. Turning the machine off and on doesn't get it started; I have to delete all the *.sah files, and then it starts up fresh.

3) Both problems occur only occasionally, AND, it has happened to some of my machines booting their own OS, NOT just to LTSP nodes.

No solutions from me, but maybe descriptions of these other issues can jog someone's thoughts...
 